Title: Process for producing fusion proteins comprising ScFv fragments by a
transformed mould

The present invention relates to the production of a Single Chain antibody
fragment (ScFv fragment) by a transformed mould. In this specification an ScFv
fragment stands for a variable fragment of a heavy chain connected by a linker
peptide to a variable fragment of a light chain.

Background of the invention 

It has been described that ScFv fragments can be produced in various transformed
microorganisms, but with varying degrees of success. For example, from WO
93/02198 (TECH. RES. INST. FINLAND; Teeri c.s.), published 04.02.93, it is
known that ScFv fragments can be produced and secreted in several host organisms
(although this is only exemplified in E. coli and S. cerevisiae), provided that a special
linker is used between the heavy chain and the light chain fragments. That linker
comprises a flexible hinge region of a naturally secreted multidomain protein or an
analogue thereof not being homologous to either of the heavy or light chain
fragments. This WO 93/02198 is incorporated herein by reference. A serious
limitation of the method disclosed in WO 93/02198 is the low production level
shown, which is far below the production level required for the application of ScFv
fragments in consumer products at a reasonable price. Examples of such consumer
products include detergent products, food products, and personal care products
such as toilet soap and underarm hygiene products. Thus there is a
need for a more universal high-yielding production system for ScFv fragments.

The production of an ScFv fragment in E. coli bacteria gives relatively low yields,
and the proteins formed inside the bacteria must be solubilized and subsequently
renatured, which makes this method unattractive for the production
of antibody fragments that are needed in relatively large amounts (see page 3,
lines 5-23 of WO 93/02198). When attempting to produce various ScFv fragments
in yeasts using expression systems that have produced various heterologous
enzymes in amounts sufficient for economical application in consumer goods, the
present inventors found that the ScFv fragments were not secreted, or only in very
minute quantities. This appears to be in agreement with Example 2 on pages 29-31
of WO 93/02198, which relates to the production of an ScFv fragment in yeast
without indicating the amount produced. Although in WO 93/02198 many
alternative linkers are mentioned, it is stated on page 6 of WO 93/02198 that
"... there are no published reports of the analysis or design of secretable linker
peptides." and "... there are no published examples to date of novel fusion proteins
with added heterologous linker sequences which are secreted to the culture
medium of the host."

In another recent publication, namely WO 92/01797 (OY ALKO AB), published
06.02.92, the production of immunoglobulins in the mould Trichoderma is
described. In Example 20 on pages 83-85 and Figure 27 the construction and
expression of a functional gene encoding a single chain antibody containing
variable regions of both a light and heavy chain linked to each other by a flexible
hinge region of CBHI is described (CBHI is cellobiohydrolase I, present in large
amounts in the culture medium of Trichoderma reesei; see page 3 of WO
92/01797). The gene was under control of a T. reesei cbh1 terminator and either a
T. reesei cbh1 promoter (plasmid pEN401) or an Aspergillus gpd promoter (plasmid
pEN402). The plasmids were transformed into Trichoderma reesei strain RUT-C-30
(ATCC 56765) and the transformants were grown in two different media.
Expression of immunoreactive single chain antibodies was tested from culture
supernatants but no results were mentioned. Thus it was not demonstrated that
any amount of single chain antibodies was actually formed. This conclusion is in
agreement with a later related publication of Nyyssönen et al. ex VTT Biotechnical
Laboratory, Finland (1993), in which partially the same experiments are described
with plasmids pEN304, pAJ202 and pEN209 encoding the 23.3 kD light chain, the
23.9 kD heavy Fd chain and the 73.2 kD CBHI-heavy Fd chain, respectively, which
plasmids are also exemplified in WO 92/01797. In this publication only the
production of a separate light chain or a separate heavy chain, as such or as a
precursor, by a Trichoderma reesei strain is described, but the production of an
ScFv fragment containing a light chain connected via a linker peptide to a heavy
chain is not described.

Therefore, there is still a need for an alternative production and secretion system
for ScFv fragments in a mould that gives at least a reasonable yield of the desired
ScFv fragment. The present invention provides such production using a
transformed mould of the genus Aspergillus.

According to M. Ward et al. (1990), see also GENENCOR's WO 90/15860
published 27.12.90, the production in Aspergillus of a desired protein and its
subsequent secretion can be improved when a fusion protein comprising the
desired protein and a mould protein is produced. This was exemplified with the
production of prochymosin fused with its amino terminus to the carboxyl terminus
of A. awamori glucoamylase. However, that publication does not give any
suggestion that such an approach would also be suitable for the production of ScFv
fragments, which are known as compounds presenting great difficulties when one
attempts to obtain their production and secretion by a microbial host (see the
above mentioned WO 93/02198).



In UNILEVER's not prior-published WO 93/12237, now published 24.06.93 and
claiming a priority date of 09.12.91, a process for the production and secretion of a
desired protein by a transformed mould is described, in which the expression
and/or secretion regulating regions are derived from the endoxylanase II gene
(exlA gene) of Aspergillus niger var. awamori present on plasmid pAW14B (see
Figure 3 of WO 93/12237), which is present in a transformed E. coli strain JM109
deposited under the Budapest Treaty at the Centraalbureau voor Schimmelcultures
in Baarn, The Netherlands, as No. CBS 237.90 on 31 May 1990. In a preferred
embodiment the desired protein can be part of a fusion protein comprising the
desired protein preceded at its NH2-terminus by at least part of the endoxylanase
II protein. No mention is made of the production of ScFv fragments.

Summary of the invention 

The present invention provides a process for producing fusion proteins comprising
ScFv fragments by a transformed mould, in which (a) the mould belongs to the
genus Aspergillus, and (b) the Aspergillus contains a DNA sequence encoding the
ScFv fragment under control of at least one expression and/or secretion regulating
region derived from a mould selected from the group consisting of promoter
sequences, terminator sequences and signal sequence-encoding DNA sequences,
and functional derivatives or analogues thereof, optionally followed by a proteolytic
cleavage step for separating the ScFv fragment part from the fusion protein. In one
embodiment the "at least one expression and/or secretion regulating region derived
from a mould" comprises the combination of both a promoter sequence and a
signal sequence-encoding DNA sequence derived from a glucoamylase gene ex
Aspergillus plus a terminator sequence of a trpC gene ex Aspergillus or at least one
functional derivative or analogue thereof. In another embodiment the "at least one
expression and/or secretion regulating region derived from a mould" is selected
from a promoter, a signal sequence-encoding DNA sequence and a terminator
sequence derived from an endoxylanase gene ex Aspergillus, especially from the
endoxylanase II gene (exlA gene) of Aspergillus niger var. awamori present on the
above mentioned plasmid pAW14B or at least one functional derivative or
analogue thereof.

In a preferred embodiment of the present invention the DNA sequence encoding
the ScFv fragment forms part of a chimeric gene encoding a fusion protein,
whereby said DNA sequence encoding the ScFv fragment is preceded at its 5' end
by at least part of a structural gene encoding the mature part of a secreted mould
protein, especially a mature Aspergillus protein, e.g. the mature glucoamylase
protein or the mature endoxylanase protein. If the ScFv fragment in the fusion
protein is connected or bound to said secreted mould protein or part thereof by a
proteolytic cleavage site, e.g. a KEX2-like site, it is possible to remove the mould
protein or part thereof from the ScFv fragment, so that the resulting antibody
fragment is as small as possible, which can have significant advantages in
applications. In this case the process according to the invention includes a
proteolytic cleavage step for separating the ScFv fragment part from the fusion
protein following the production of the fusion protein containing the ScFv
fragment. It was found that production levels of at least 40 mg ScFv fragment per
litre, or even at least 60 mg/l, and a highest yield of slightly more than 90 mg/l
could be obtained (see Table 2 below), but it is envisaged that after further
optimization at least 150 mg/l can be achieved by cultivation in shake flasks.
Further, production levels of more than 150 mg ScFv fragment per litre were
already obtained with cultivation in a fermenter; it is therefore envisaged that after
further optimization at least 250 mg/l, or even at least 500 mg/l, and probably
more than 1 g/l will be obtainable.

The invention also provides new products comprising an ScFv fragment or fusion
product thereof obtainable by a process according to the invention. Such a new
product can be one in which the ScFv fragment is a modified ScFv fragment
comprising complementarity determining regions (CDRs) grafted on the framework
regions of the variable fragments of another ScFv fragment that is well expressed
and secreted by a lower eukaryote, especially a mould of the genus Aspergillus.
The invention also provides a composition, in particular consumer products of
which examples are given above, containing a product produced by a process
according to the invention or a new product as described above. According to a
special embodiment of the invention the ScFv fragment recognizes a compound
present in the human eco-system, which compound can be a microorganism, an
enzyme or another protein. One preference is for compounds present in the oral
cavity, and more preferably for compounds involved in the formation of plaque,
caries, gingivitis, periodontal diseases, or bad breath. Another preference is for
compounds present on the human skin, more preferably compounds involved in the
formation of malodour, inflammation or hair loss. Another special embodiment of
the invention relates to a composition which can be used for diagnostic purposes
and in which the compound is a hormone, especially human chorionic
gonadotropin (HCG).

According to another embodiment of the invention the ScFv fragment recognizes a
compound present in the eco-system of domestic and agricultural animals, which
compound can be an animal feed component, an enzyme or another protein, or a
disease-causing agent.

According to still another embodiment of the invention a composition is provided
in which the ScFv fragment recognizes a compound that has a positive or negative
relationship with a disease or disorder and can for example be used for detection
and/or targeting purposes.

The invention also relates to a composition according to the invention which can
be used in the chemical, petrol or pharmaceutical industry as a catalyst or for
detection purposes.

Although the invention was developed on the basis of the production of ScFv 
fragments in a mould of the genus Aspergillus, as will be illustrated in the Examples 
below, it is envisaged that the invention will also be applicable to other moulds, 
especially selected from the genera Mucor, Neurospora, and Penicillium. 


Brief description of the figures 

Figure 1 Schematic drawing of pAN52-10.
Figure 2 Schematic drawing of pUR4155 and pUR4157.
Figure 3 Schematic drawing of pAN56-7.
Figure 4 Schematic drawing of pUR4159 and pUR4161.
Figure 5 Western blot. After gel electrophoresis on a 12.5% SDS-PAGE gel,
proteins reacting with Fv-lysozyme antiserum are visualized.
Lane 1: E. coli extract containing ScFv-lysozyme; Lane 2: Fv-lysozyme;
Lanes 3 to 8 contain medium samples of AWC(M)41 transformants and
the A. niger var. awamori mutant #40 strain; Lanes 3 and 4: transformant
AWC(M)4161 (prepro-"glaA2"-KEX-ScFv-HCG); Lane 5: AWC4159
(prepro-"glaA2"-KEX-ScFv-LYS); Lane 6: mutant #40; Lane 7:
AWC4157 (18aa glaA-ScFv-HCG); Lane 8: AWC4155 (18aa glaA-ScFv-LYS).
Figure 6 Map of plasmid pAW14B obtained by insertion of the 5.3 kb SalI
fragment comprising the exlA gene of Aspergillus niger var. awamori in the
SalI site of pUC19.
Figure 7 Coomassie Brilliant Blue-stained polyacrylamide gel showing proteins
present in the culture medium of an Aspergillus niger var. awamori
transformed with pUR4462; also indicated are the bands representing
(i) the released ScFv-LYS fragment, and
(ii) the glaA-KEX2-ScFv-LYS fusion protein and/or the truncated
glaA protein.

Detailed description of the invention 
It has now been found that the development described above by M. Ward et al.
(1990) and in WO 90/15860 (in which the gene encoding the desired protein forms
part of a chimeric gene further comprising a gene encoding the glucoamylase
protein) as well as the above described preferred embodiment of the invention
described in UNILEVER's above mentioned not prior-published WO 93/12237 (in
which the gene encoding the desired protein forms part of a chimeric gene further
comprising a gene encoding at least part of the endoxylanase protein) can be
applied advantageously for the production of ScFv fragments, so that the desired
protein is the ScFv fragment. This is particularly so when in the resulting fusion
protein a proteolytic cleavage site is present between the secreted mould protein
part or fragment thereof and the ScFv part. A preferred cleavage site is a KEX2-like
site as described by Fuller et al. (1988), Contreras et al. (1991) and Calmels et
al. (1991), but other cleavage sites can also be used provided that they are not
present in the ScFv fragment. Other cleavage sites can be selected on the basis of
the method described by Matthews & Wells (1993). In the Examples given below
the pro part of the prepro-glucoamylase protein comprises a KEX2-type
recognition site, see Example 2.4 (i).
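
The requirement that the chosen cleavage site should not also occur inside the ScFv fragment can be checked on the amino acid sequences before any construct is made. The following is only a minimal sketch: it treats a KEX2-type site as the bare dipeptide Lys-Arg (KR), which is a simplifying assumption, and the function names are hypothetical. The propeptide sequence is the one given in Example 2.4 (i).

```python
# Illustrative only: real KEX2-type proteases have a richer substrate
# specificity than a bare Lys-Arg (KR) match; this is an assumption.

def find_motif(protein: str, motif: str = "KR") -> list[int]:
    """Return the 0-based start positions of every occurrence of `motif`."""
    return [i for i in range(len(protein) - len(motif) + 1)
            if protein.startswith(motif, i)]


def cleave_after_first(protein: str, motif: str = "KR") -> tuple[str, str]:
    """Split `protein` directly after the first occurrence of `motif`."""
    hits = find_motif(protein, motif)
    if not hits:
        return protein, ""
    cut = hits[0] + len(motif)
    return protein[:cut], protein[cut:]


PROPEPTIDE = "NVISKR"   # glucoamylase propeptide, Example 2.4 (i)
SCFV_N_TERM = "QVQLQ"   # first five residues of the ScFv heavy-chain part

carrier_part, scfv_part = cleave_after_first(PROPEPTIDE + SCFV_N_TERM)
print(carrier_part, scfv_part)   # carrier side and released ScFv side
print(find_motif(SCFV_N_TERM))   # positions of KR inside the ScFv part
```

An empty list from `find_motif` for the complete ScFv sequence would indicate that, under this simplified model, the site occurs only at the intended junction.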

ScFv fragments that recognize microorganisms present in the oral cavity or on the
skin of human beings are important in the framework of this invention, because
they have potential to inhibit the growth or metabolism of these microorganisms.
Certain microorganisms present in the oral cavity are thought to be involved in the
formation of plaque, caries, gingivitis or periodontal diseases, etc., whereas
microorganisms on the human skin are involved in, amongst others, the generation
of malodour. The ScFv fragments prepared according to the invention may exert
their action either as such, or bound to other compounds that have an inhibitory
effect on said microorganisms.



WO 94/29457 (PCT/EP94/01906)



It is also envisaged that according to the present invention other modified ScFv
fragments can be made by grafting a complementarity determining region (CDR) on
the framework regions of the variable fragments of an ScFv fragment that is well
expressed and secreted in Aspergillus; compare grafting of CDRs on human
immunoglobulins as described by e.g. Jones et al. (1986). These CDRs can be
obtained from common antibodies. Both the binding properties of a CDR and the
remainder of the ScFv fragment can be optimized by random or directed
mutagenesis. Thus in a process according to the invention CDRs originating from
one antibody can be grafted on the framework regions of the variable fragments of
another ScFv fragment.

Some ScFv fragments or fusion products thereof produced by a process according
to the invention may be old, but many of the ScFv fragments or fusion products
thereof will be new products. Thus the invention also provides new ScFv fragments
or fusion products thereof obtainable by a process according to the invention.
The products resulting from such a process can be used in compositions for various
applications. Therefore, the invention also relates to compositions containing a
product produced by a process according to the invention. This holds for both old
products and new products.

Instead of the combination of an exlA promoter, an exlA signal sequence-encoding
DNA sequence, and an exlA terminator exemplified in Examples 3 and 5,
other combinations can also be used, e.g. an exlA promoter, a glaA signal sequence-encoding
DNA sequence, and an exlA terminator as exemplified in Example 7, but
in general a selection can be made from any mould-derived promoter, mould-derived
signal sequence-encoding DNA sequence, and mould-derived terminator
sequence as expression and/or secretion regulating regions. A specific embodiment
is a combination of both a promoter sequence and a signal sequence-encoding
DNA sequence derived from a glucoamylase gene ex Aspergillus plus a terminator
sequence of a trpC gene ex Aspergillus.

The secreted mould protein forming part of a fusion protein according to the
invention can in general be derived from any secreted mould protein in addition to
the exemplified endoxylanase II protein ex Aspergillus niger var. awamori (see
Examples 3 and 5) and the exemplified glucoamylase ex Aspergillus (see Example
7).

Table 2 in Example 2.6.1b shows that the highest expression and secretion yield
was obtained when the mould protein was composed of its prepro part followed by
an appreciable part of its mature protein, which was connected to the ScFv
fragment by again the pro part of the mould protein containing a KEX2-like
cleavage site. A small linker peptide may be situated between the ScFv fragment
and the KEX2-like cleavage site (see plasmids pUR4159 and pUR4163 and
derivatives) or between the latter and the part of the mature mould protein.
Thus in its broadest sense the invention provides a process for producing fusion
proteins comprising ScFv fragments by a transformed mould, in which the mould
belongs to the genus Aspergillus, and the Aspergillus contains a DNA sequence
encoding the ScFv fragment under control of at least one expression and/or
secretion regulating region derived from a mould selected from the group
consisting of promoter sequences, terminator sequences and signal sequence-encoding
DNA sequences, or functional derivatives or analogues thereof.

The invention will be illustrated by the following Examples. 


Example 1 Isolation of the antibody gene fragments encoding the VH and VL
regions and the construction of ScFv genes.
The isolation of RNA from the hybridoma cell lines, the preparation of cDNA and
amplification of gene fragments encoding the variable regions of the heavy (VH)
and light (VL) chains of the antibodies by PCR were performed according to
standard procedures known from the literature (see e.g. Orlandi et al., 1989). The
general procedures described in the Examples were performed according to
Sambrook et al., unless otherwise indicated.

After cloning the VH and VL gene fragments and determining the nucleotide
sequence, they can be used to construct expression plasmids encoding e.g. Fv or
ScFv antibody fragments. In the ScFv antibody fragments, the VH and the VL
chains are connected via a peptide linker. This is achieved by constructing a
(chimeric) gene in which the gene fragments encoding the VH and VL chains are
connected with a nucleotide sequence encoding the linker peptide. The order of
the variable chains can be VH-linker-VL or VL-linker-VH. In the following
experiments the peptide linker with the sequence (GGGGS)3 is used (SEQ. ID.
NO: 1).
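
At the amino acid level the two possible domain orders can be written out as a simple concatenation. The sketch below is illustrative only: `build_scfv` is a hypothetical helper, and the short VH and VL strings in the usage lines are placeholders rather than complete variable domains.

```python
LINKER = "GGGGS" * 3  # the (GGGGS)3 linker peptide, SEQ. ID. NO: 1


def build_scfv(vh: str, vl: str, order: str = "VH-VL", linker: str = LINKER) -> str:
    """Join two variable-domain sequences with the linker in the given order."""
    if order == "VH-VL":
        return vh + linker + vl
    if order == "VL-VH":
        return vl + linker + vh
    raise ValueError(f"unknown domain order: {order!r}")


# Placeholder fragments, not complete variable domains.
print(build_scfv("QVQLQ", "LEIKR"))
print(build_scfv("QVQLQ", "LEIKR", order="VL-VH"))
```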

1.1 Construction of ScFv anti-lysozyme

Plasmid pScFv-LYS-myc was obtained from G. Winter and was described by S.
Ward et al. (1989). This pUC19-derived plasmid contains a gene fragment
encoding the VH and VL fragments of the anti-hen egg white lysozyme antibody
D1.3. The VH fragment is preceded by the PelB secretion signal sequence, the VH
and VL fragments are connected via the (GGGGS)3 peptide linker (SEQ. ID. NO:
1) and the VL fragment is extended with an 11 amino acid myc-tag. The
nucleotide sequence (SEQ. ID. NO: 2) and the deduced amino acid sequence
(SEQ. ID. NO: 3) of the HindIII-EcoRI fragment encoding the ScFv fragment of
the monoclonal anti-lysozyme antibody D1.3, preceded by the PelB signal sequence
and followed by the myc-tail, are given below.



Nucleotide and deduced amino acid sequence of ScFv-LYS-myc

    HindIII
  1 AAGCTTGCATGCAAATTCTATTTCAAGGAGACAGTCATAATGAAATACCT  50
    M K Y L
    > PelB ss

 51 ATTGCCTACGGCAGCCGCTGGATTGTTATTACTCGCTGCCCAACCAGCGA 100
    LPTAAAGLLLLAAQPA

    PstI
101 TGGCCCAGGTGCAGCTGCAGGAGTCAGGACCTGGCCTGGTGGCGCCCTCA 150
    MAQVQLQESGPGLVAPS
    > Vh

151 CAGAGCCTGTCCATCACATGCACCGTCTCAGGGTTCTCATTAACCGGCTA 200
    QSLSITCTVSGFSLTGY
    >

201 TGGTGTAAACTGGGTTCGCCAGCCTCCAGGAAAGGGTCTGGAGTGGCTGG 250
    GVNWVRQPPGKGLEWL
    CDR I <

251 GAATGATTTGGGGTGATGGAAACACAGACTATAATTCAGCTCTCAAATCC 300
    GMIWGDGNTDYNSALKS
    > CDR II <

301 AGACTGAGCATCAGCAAGGACAACTCCAAGAGCCAAGTTTTCTTAAAAAT 350
    RLSISKDNSKSQVFLKM

351 GAACAGTCTGCACACTGATGACACAGCCAGGTACTACTGTGCCAGAGAGA 400
    NSLHTDDTARYYCARE
    >

    BstEII
401 GAGATTATAGGCTTGACTACTGGGGCCAAGGCACCACGGTCACCGTCTCC 450
    RDYRLDYWGQGTTVTVS
    CDR III <

451 TCAGGTGGAGGCGGTTCAGGCGGAGGTGGCTCTGGCGGTGGCGGATCGGA 500
    SGGGGSGGGGSGGGGSD
    > Linker < >

    SacI
501 CATCGAGCTCACTCAGTCTCCAGCCTCCCTTTCTGCGTCTGTGGGAGAAA 550
    IELTQSPASLSASVGE
    Vl

551 CTGTCACCATCACATGTCGAGCAAGTGGGAATATTCACAATTATTTAGCA 600
    TVTITCRASGNIHNYLA
    > CDR I <

601 TGGTATCAGCAGAAACAGGGAAAATCTCCTCAGCTCCTGGTCTATTATAC 650
    WYQQKQGKSPQLLVYYT

651 AACAACCTTAGCAGATGGTGTGCCATCAAGGTTCAGTGGCAGTGGATCAG 700
    TTLADGVPSRFSGSGS
    CDR II <

701 GAACACAATATTCTCTCAAGATCAACAGCCTGCAACCTGAAGATTTTGGG 750
    GTQYSLKINSLQPEDFG

751 AGTTATTACTGTCAACATTTTTGGAGTACTCCTCGGACGTTCGGTGGAGG 800
    SYYCQHFWSTPRTFGGG
    > CDR III <

    XhoI
801 CACCAAGCTCGAGATCAAACGGGAACAAAAACTCATCTCAGAAGAGGATC 850
    TKLEIKREQKLISEED
    > myc tail

    BclI BamHI EcoRI
851 TGAATTAATAATGATCAAACGGTAATAAGGATCCAGCTCGAATTC 895
    L N * * *
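
The restriction sites annotated in the listing can be located programmatically. The sketch below is an illustration under stated assumptions: `find_sites` is a hypothetical helper, the recognition sequences used are the standard ones for PstI (CTGCAG) and BstEII (GGTNACC, N being any base), and the two test strings are nucleotides 101-150 and 401-450 of the listing above.

```python
import re

# Map the bases used in the recognition sequences to regular-expression atoms.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of `site` in `seq`, overlaps included."""
    pattern = "".join(IUPAC[base] for base in site)
    # A lookahead keeps the match zero-width so overlapping sites are reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]


NT_101_150 = "TGGCCCAGGTGCAGCTGCAGGAGTCAGGACCTGGCCTGGTGGCGCCCTCA"
NT_401_450 = "GAGATTATAGGCTTGACTACTGGGGCCAAGGCACCACGGTCACCGTCTCC"

print(find_sites(NT_101_150, "CTGCAG"))   # PstI
print(find_sites(NT_401_450, "GGTNACC"))  # BstEII
```

Adding the block offset (101 or 401) to a returned index gives the position in the numbering of the listing.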



In order to remove the myc-tag of pUC19-derived pScFv-LYS-myc, the XhoI-EcoRI
fragment was replaced by a new synthetic fragment having the following sequence:

         E   I   K   R   *   *             (SEQ. ID. NO: 6)
5'- TC GAG ATC AAA CGG TAA TGA G      -3'  (SEQ. ID. NO: 4)
3'-      C TAG TTT GCC ATT ACT CTT AA -5'  (SEQ. ID. NO: 5)
    XhoI                       EcoRI

This fragment introduces a TAA translation termination codon after the VL gene
fragment. The obtained plasmid was named pUR4121. Subsequently, the about 820 bp
HindIII-EcoRI fragment encoding the ScFv-LYS was isolated and cloned into a
pEMBL9-derived plasmid (Dente et al., 1983), which was digested with the same
enzymes, resulting in plasmid pUR4129.
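
The two synthetic strands can be checked for complementarity and for the encoded peptide with a few lines of code. This is a sketch only: the helper names are hypothetical, the standard genetic code is assumed, and the strands are typed in from SEQ. ID. NO: 4 and 5 as printed above (the lower strand in its printed 3' to 5' direction).

```python
# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate complete codons of `dna`; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def complement(dna: str) -> str:
    """Base-by-base complement, without reversing the strand."""
    return dna.translate(COMPLEMENT)


TOP = "TCGAGATCAAACGGTAATGAG"             # 5' -> 3', SEQ. ID. NO: 4
BOTTOM_3_TO_5 = "CTAGTTTGCCATTACTCTTAA"   # printed 3' -> 5', SEQ. ID. NO: 5

# The first four bases of the upper strand (XhoI end) and the last four of the
# lower strand as printed (EcoRI end) are single-stranded overhangs.
print(complement(TOP[4:]) == BOTTOM_3_TO_5[:-4])
print(translate(TOP[2:20]))  # reading frame starts at GAG
```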



1.2 Construction of a gene encoding ScFv anti-human chorionic
gonadotropin

Human chorionic gonadotropin (HCG) is a pregnancy hormone. A pregnancy test
kit based on the detection of HCG in urine by using monoclonal antibodies was
developed by Unilever and is marketed by UNIPATH under the trade name
Clearblue®. Gene fragments encoding the variable regions of the heavy and light
chain fragments from the monoclonal antibody directed against human
chorionic gonadotropin were obtained from a hybridoma cell line in the way
described above. Subsequently, these HCG VH and VL gene fragments were cloned
into plasmid pUR4129 by replacing the corresponding PstI-BstEII and SacI-XhoI
anti-lysozyme gene fragments, resulting in plasmid pUR4138. The nucleotide
sequence (SEQ. ID. NO: 7) and the deduced amino acid sequence (SEQ. ID. NO:
8) of the PstI-XhoI gene fragment encoding the ScFv fragment of the anti-human
chorionic gonadotropin (anti-HCG) antibody are given below.
Nucleotide sequence and deduced amino acid sequence of ScFv-HCG

    PstI
  1 CTGCAGGAGTCTGGGGGACACTTAGTGAAGCCTGGAGGGTCCCTGAAACT  50
    LQESGGHLVKPGGSLKL

 51 CTCCTGTGCAGCCTCTGGATTCGCTTTCAGTAGCTTTGACATGTCTTGGA 100
    SCAASGFAFSSFDMSW
    > CDR I <

101 TTCGCCAGACTCCGGAGAAGAGGCTGGAGTGGGTCGCAAGCATTACTAAT 150
    IRQTPEKRLEWVASITN
    >

151 GTTGGTACTTACACCTACTATCCAGGCAGTGTGAAGGGCCGATTCTCCAT 200
    VGTYTYYPGSVKGRFSI
    CDR II <

201 CTCCAGAGACAATGCCAGGAACACCCTAAACCTGCAAATGAGCAGTCTGA 250
    SRDNARNTLNLQMSSL

251 GGTCTGAGGACACGGCCTTGTATTTCTGTGCAAGACAGGGGACTGCGGCA 300
    RSEDTALYFCARQGTAA
    >

    BstEII
301 CAACCTTACTGGTACTTCGATGTCTGGGGCCAAGGGACCACGGTCACCGT 350
    QPYWYFDVWGQGTTVTV
    CDR III <

351 CTCCTCAGGTGGAGGCGGTTCAGGCGGAGGTGGCTCTGGCGGTGGCGGAT 400
    SSGGGGSGGGGSGGGG
    > Linker

    SacI
401 CGGACATCGAGCTCACCCAGTCTCCAAAATCCATGTCCATGTCCGTAGGA 450
    SDIELTQSPKSMSMSVG
    < > Vl

451 GAGAGGGTCACCTTGAGCTGCAAGGCCAGTGAGACTGTGGATTCTTTTGT 500
    ERVTLSCKASETVDSFV
    > CDR I

501 GTCCTGGTATCAACAGAAACCAGAACAGTCTCCTAAATTGTTGATATTCG 550
    SWYQQKPEQSPKLLIF
    < >

551 GGGCATCCAACCGGTTCAGTGGGGTCCCCGATCGCTTCACTGGCAGTGGA 600
    GASNRFSGVPDRFTGSG
    CDR II <

601 TCTGCAACAGACTTCACTCTGACCATCAGCAGTGTGCAGGCTGAGGACTT 650
    SATDFTLTISSVQAEDF

651 TGCGGATTACCACTGTGGACAGACTTACAATCATCCGTATACGTTCGGAG 700
    ADYHCGQTYNHPYTFG
    > CDR III <

    XhoI
701 GGGGGACCAAGCTCGAG 717
    G G T K L E



Example 2 Construction of ScFv expression cassettes using the glaA
promoter system and introduction into Aspergillus.
2.1 Construction of ScFv expression cassettes using the 18 amino acid signal
sequence of glucoamylase (pUR4155 and pUR4157)

The multiple cloning site of plasmid pEMBL9 (ranging from the EcoRI to the
HindIII site) was replaced by a synthetic DNA fragment having the following
nucleotide sequence.



Nucleotide sequence of the synthetic EcoRI-HindIII fragment cloned in pEMBL9 and
used for preparing pUR4153

18 amino acid signal sequence of glucoamylase

         M   G   F   R   S   L   L   A   L   S   G   L   V   C   T   G   L   A
AATTC C ATG GGC TTC CGA TCT CTA CTC GCC CTG AGC GGC CTC GTC TGC ACA GGG TTG GCA --
    G G TAC CCG AAG GCT AGA GAT GAG CGG GAC TCG CCG GAG CAG ACG TGT CCC AAC CGT --
EcoRI NcoI

   N-term ScFv                          C-term ScFv
    Q   V   Q   L   Q   *   V   T   K   L   E   I   K   R   *   *      (SEQ. ID. NO: 11-12)
-- CAG GTG CAG CTG CAG TAA GTG ACT AAG CTC GAG ATC AAA CGG TGA TA      (SEQ. ID. NO: 9)
-- GTC CAC GTC GAC GTC ATT CAC TGA TTC GAG CTC TAG TTT GCC ACT ATT CGA (SEQ. ID. NO: 10)
               PstI                    XhoI                HindIII



The 5'-part of the nucleotide sequence codes for the glaA signal sequence (amino
acids 1 to 18), followed by the first 5 amino acids of the variable part of the
antibody heavy chain. The 3'-part encodes the last 5 amino acid residues of the
variable part of the antibody light chain. The resulting plasmid was named
pUR4153.

Plasmids pUR4154 and pUR4156 were obtained in the following way. After
digestion of plasmid pUR4129 (Example 1.1) with PstI and XhoI, an about 0.7 kb
DNA fragment was isolated from an agarose gel. This fragment codes for a truncated
ScFv-LYS fragment missing the DNA sequences encoding the 5 N-terminal and 5
C-terminal amino acids. In the same way an about 0.7 kb PstI-XhoI fragment was
isolated from plasmid pUR4138 (Example 1.2), which encodes a similarly
truncated ScFv-HCG fragment.

In order to fuse the ScFv encoding fragments with the glaA secretion
signal-encoding sequence, the obtained fragments were cloned into pUR4153. To this end
plasmid pUR4153 was digested with PstI and XhoI, after which the about 4.1 kb
vector fragment was isolated from an agarose gel. Ligation with the about 0.7 kb
PstI-XhoI fragments resulted in plasmids pUR4154 (ScFv-LYS) and pUR4156
(ScFv-HCG), respectively.
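
The fragment exchange between the PstI and XhoI sites can be mimicked on plain sequence strings. The sketch below is a deliberate simplification and not a model of the actual cloning: it ignores cut positions and overhangs, assumes each site is unique in the vector, and uses short placeholder sequences rather than the sequences of pUR4153 or the ScFv fragments. `swap_fragment` is a hypothetical helper.

```python
def swap_fragment(vector: str, insert: str, site5: str, site3: str) -> str:
    """Replace the vector region from `site5` through the next `site3` by `insert`.

    The insert is expected to carry both recognition sequences at its ends,
    so the sites are regenerated in the product.
    """
    if not (insert.startswith(site5) and insert.endswith(site3)):
        raise ValueError("insert must start with site5 and end with site3")
    start = vector.index(site5)
    end = vector.index(site3, start + len(site5)) + len(site3)
    return vector[:start] + insert + vector[end:]


PSTI, XHOI = "CTGCAG", "CTCGAG"

# Placeholder sequences for illustration only.
toy_vector = "AAA" + PSTI + "TTTT" + XHOI + "CCC"
toy_insert = PSTI + "GGGGGG" + XHOI

print(swap_fragment(toy_vector, toy_insert, PSTI, XHOI))
```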

2.2 Construction of pAN52-10
pAN52-10 (Figure 1) was used as starting vector for the construction of the
Aspergillus expression cassettes. This plasmid was constructed as follows:
In pAN52-6NotI (Van den Hondel et al., 1991) the NcoI site located in the glaA
promoter of A. niger N402 (about 2.7 kb upstream of the ATG) was removed by
cleaving with NcoI and filling in with Klenow polymerase, resulting in
pAN52-6NotI delta NcoI. After digestion of pAN52-6NotI delta NcoI with NotI and partial
digestion with XmnI, an about 4.0 kb NotI-XmnI glaA promoter fragment was
isolated. Three-way ligation of this pAN52-6NotI delta NcoI fragment (1) with an
about 3.4 kb NotI-NcoI fragment (2) of pAN52-1NotI (Van den Hondel, C.A.M.J.J.
et al., 1991), comprising the A. nidulans trpC terminator (Punt, J.P. et al., 1991) and
pUC18 sequences, and with a synthetic XmnI-NcoI fragment (3) comprising the
3'-end of the glaA promoter to the ATG initiation codon, resulted in plasmid
pAN52-7NotI. The nucleotide sequence (SEQ. ID. NO: 13-14) of this synthetic XmnI-NcoI
fragment is given below.



5'- GCT TCC TCC CTT TTA GAC GCA ACT GAG AGC CTG
3'- CGA AGG AGG GAA AAT CTG CGT TGA CTC TCG GAC
    XmnI

    AGC TTC ATC CCC AGC ATC ATT ACA CCT CAG C      -3'
    TCG AAG TAG GGG TCG TAG TAA TGT GGA GTC GGT AC -5'
                                            NcoI



After isolating both the about 4 kb NotI-NcoI fragment (comprising the glaA
promoter) and the about 3.4 kb NotI-BamHI fragment (comprising the pUC18
vector and the trpC terminator) from pAN52-7NotI, the fragments were ligated
together with the NcoI-BamHI linkers containing an EcoRV site and a HindIII
site and having the following nucleotide sequences (SEQ. ID. NO: 15-16).

5'- CATGGCCGATATCGCAAGCTTCCG     -3'
3'-     CGGCTATAGCGTTCGAAGGCCTAG -5'
    NcoI   EcoRV   HindIII  BamHI



This resulted in plasmid pAN52-9. Ligation of the about 4.0 kb NotI-HindIII glaA
promoter fragment of pAN52-9 with an about 3.3 kb HindIII-NotI fragment of
pAN52-6NotI containing both pUC18 sequences and an about 0.7 kb trpC
terminator fragment of A. nidulans resulted in pAN52-10 (Figure 1).

2.3 Construction of pUR4155 and pUR4157.

Plasmid pAN52-10 was digested with NcoI and HindIII and the dephosphorylated
vector fragment of about 7.5 kb was isolated. The NcoI site is located downstream
of the glaA promoter and coincides with the ATG initiation codon. The plasmids
pUR4154 and pUR4156 (see Example 2.1) were digested with NcoI and HindIII
and the about 0.8 kb fragments coding for the ss-glaA and the ScFv were isolated.
Ligation of the obtained fragments resulted in plasmids pUR4155 and pUR4157,
respectively (Figure 2). In these plasmids the expression of the ScFv fragments is
under the control of the A. niger glaA promoter, the 18 amino acid signal sequence
of glucoamylase and the A. nidulans trpC terminator.

2.4 Construction of ScFv expression cassettes using part of glucoamylase as 
a secretion carrier. 

i) Construction of pUR4159 and pUR4161. 

Expression cassettes encoding a fusion protein consisting of the glaA prepro part,
the first 514 amino acids of the mature glucoamylase G1 protein ("glaA2" protein),
and the ScFv fragments were constructed. In these cassettes the "glaA2" protein
and the ScFv fragment were intersected by a sequence which encodes the
propeptide of glucoamylase (Asn-Val-Ile-Ser-Lys-Arg; SEQ. ID. NO: 45) and which
comprises a KEX2-type recognition site (Lys-Arg). To obtain these vectors, plasmid
pAN56-7 (Figure 3) was constructed by insertion of a 1.9 kb NcoI-EcoRV fragment
of pAN56-4, comprising part of the A. niger glaA gene, into the about 7.5 kb
NcoI-EcoRV fragment of pAN52-10. Plasmid pAN56-4 was not prior-published but its
description is now available in the publication of M.P. Broekhuijsen, I.E. Mattern,
R. Contreras, J.R. Kinghorn & C.A.M.J.J. van den Hondel in Journal of
Biotechnology 31, No. 2 (1993) 135-145, which is incorporated herein by reference;
a copy of the draft paper was attached to the priority documents.
To obtain in-frame fusions of the "glaA2" protein and the ScFv fragments, plasmids
pUR4154 and pUR4156 were digested with EcoRI and PstI, after which the vector
fragment of about 4.8 kb was isolated from an agarose gel. The vector was ligated
with a synthetic EcoRI-PstI fragment having the following nucleotide sequence
(SEQ. ID. NO: 17-19).

          KEX2             spacer      N-term ScFv
          I   S    K   R   G   G   S   Q   V   Q   L   Q
AAT TC G ATA TC G AAG CGC GGC GGA TCC CAG GTG CAG CTG CA
     G C TAT AG C TTC GCG CCG CCT AGG GTC CAC GTC G
EcoRI  EcoRV                  BamHI               PstI



This EcoRI-PstI fragment was used to replace the fragment encoding the glaA
signal sequence (see Example 2.1) and to allow an in-frame fusion to the "glaA2"
gene. From the resulting plasmids, pUR4158 and pUR4160, the EcoRV-HindIII
fragments (about 0.75 kb) were isolated and ligated into the EcoRV-HindIII
fragment of pAN56-7 (about 9.3 kb), resulting in pUR4159 and pUR4161 (Figure
4, in which the DNA encoding the 24 amino acid prepro glaA part in the
neighbourhood of the NcoI site was not indicated). In the resulting protein the
"glaA2" part and the ScFv part are connected by a peptide comprising a KEX2
cleavage site.
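
The reading frame of the synthetic EcoRI-PstI fragment can be checked by hand, but a short script makes the check repeatable. The following is a minimal sketch in Python (not part of the original disclosure): it translates the coding strand as printed in SEQ. ID. NO: 17-19 with the standard genetic code and reports where the EcoRV and BamHI sites lie. The function and variable names are illustrative assumptions. The last Gln codon is incomplete in the fragment itself and is only completed by the PstI site of the vector, so only eleven residues are translated.

```python
# Sketch: translate the coding strand of the synthetic EcoRI-PstI fragment
# and locate the restriction sites annotated in the text.
# Names (translate, TOP_STRAND, ...) are illustrative, not from the patent.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code built from the compact TCAG-ordered table.
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, frame_start: int = 0) -> str:
    """Translate complete codons only, starting at index frame_start."""
    codons = (dna[i:i + 3] for i in range(frame_start, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Coding strand as printed in the text, spaces removed (5' -> 3').
TOP_STRAND = "AATTCGATATCGAAGCGCGGCGGATCCCAGGTGCAGCTGCA"

# The frame is chosen so that ATA (Ile) is the first codon, as annotated.
FRAME_START = TOP_STRAND.index("ATATCG")

SITES = {"EcoRV": "GATATC", "BamHI": "GGATCC"}

if __name__ == "__main__":
    print("peptide:", translate(TOP_STRAND, FRAME_START))
    for name, site in SITES.items():
        print(f"{name} site starts at index {TOP_STRAND.find(site)}")
```

The translated peptide reproduces the annotated KEX2 site (ISKR), the GGS spacer and the start of the ScFv heavy chain, which is what an in-frame fusion to the "glaA2" part requires.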



ii) Construction of pUR4163.

In a similar way a vector was constructed with an expression cassette encoding a
fusion protein consisting of the "glaA2" protein (preceded by its prepro part) fused
to ScFv-lysozyme and intersected by a factor Xa recognition site. The EcoRI-PstI
vector fragment (about 4.8 kb) of pUR4154 was ligated with a synthetic EcoRI-PstI
fragment having the following nucleotide sequence (SEQ. ID. NO: 20-22).



                  factor Xa        spacer
          I   S    I   E   G   R   G   G   S   --
AAT TC G ATA TC G ATC GAA GGT CGA GGC GGA TCC --
     G C TAT AG C TAG CTT CCA GCT CCG CCT AGG --
EcoRI  EcoRV                          BamHI

-- N-term ScFv
--  Q   V   Q   L   Q
-- CAG GTG CAG CTG CAG
-- GTC CAC GTC G
               PstI

35 

This EcoRI-PstI fragment was used to replace the fragment encoding the glaA
signal sequence and to allow an in-frame fusion to the "glaA2" gene. In the
encoded protein the "glaA2" part and the ScFv part are connected by a peptide
comprising a factor Xa cleavage site. From the resulting plasmid pUR4162, the
EcoRV-HindIII fragment (about 0.75 kb) was isolated and ligated into the
pAN56-7 vector fragment (about 9.3 kb), resulting in pUR4163.
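
To make the difference between the two linker designs concrete, the sketch below splits each linker peptide at its protease recognition motif. It assumes, as general biochemical background rather than something stated in this specification, that KEX2 cuts on the C-terminal side of Lys-Arg and factor Xa on the C-terminal side of Ile-Glu-Gly-Arg. The helper name `cleave_after` and the dictionary layout are hypothetical.

```python
# Sketch: where the two linker peptides would be cut, assuming each
# protease cleaves directly after its recognition motif (an assumption,
# not a statement from the patent text).

LINKERS = {
    # protease: (linker peptide incl. first ScFv residues, motif)
    "KEX2": ("ISKRGGSQVQLQ", "KR"),            # pUR4159 / pUR4161 type
    "factor Xa": ("ISIEGRGGSQVQLQ", "IEGR"),   # pUR4163 type
}


def cleave_after(peptide: str, motif: str) -> tuple[str, str]:
    """Split peptide directly after the first occurrence of motif."""
    pos = peptide.find(motif)
    if pos < 0:
        raise ValueError(f"motif {motif!r} not found in {peptide!r}")
    cut = pos + len(motif)
    return peptide[:cut], peptide[cut:]


if __name__ == "__main__":
    for protease, (peptide, motif) in LINKERS.items():
        carrier_end, scfv_start = cleave_after(peptide, motif)
        print(f"{protease}: ...{carrier_end} | {scfv_start}...")
```

Under that assumption both designs would release an ScFv fragment that still carries the GGS spacer at its N-terminus.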

2.5 Aspergillus transformation

The constructed vectors can be provided with conventional selection markers (e.g.
amdS or pyrG, hygromycin etc.) and the fungus can be transformed with the
resulting vectors to produce the desired protein.



Table 1

Expression vectors for the production of ScFv-anti-lysozyme and ScFv-anti-human
chorionic gonadotropin, resp., controlled by the A. niger glaA promoter and A.
nidulans trpC terminator with A. nidulans amdS as selection marker

Plasmid    ScFv-antibody   secretion-carrier   cleavage of ScFv-antibody by
pUR4155    ScFv-LYS        18 a.a. ss glaA     signal peptidase
pUR4159    ScFv-LYS        prepro-"glaA2"      KEX2-enzyme
pUR4163    ScFv-LYS        as in pUR4159       factor Xa
pUR4157    ScFv-HCG        as in pUR4155       signal peptidase
pUR4161    ScFv-HCG        as in pUR4159       KEX2-enzyme



As an example, the Aspergillus nidulans amdS gene (Hynes M.J. et al.; 1983) located
on a 5.0 kb NotI fragment was introduced in the unique NotI sites of the ScFv
expression vectors pUR4155, pUR4157, pUR4159, pUR4161 and pUR4163
yielding pUR4155NOT, pUR4157NOT, pUR4159NOT, pUR4161NOT and
pUR4163NOT, respectively (Table 1). The amdS NotI fragment was obtained by
flanking the EcoRI fragment of pGW325 (Wernars K.; Ph.D. thesis 1986) with the
following synthetic oligonucleotides.

5'- GGCCGC TGTGCAG     -3' (SEQ. ID. NO: 23)
3'-     CGACACG TCTTAA -5' (SEQ. ID. NO: 24)
    NotI         EcoRI

The constructed pUR41..NOT vectors (pUR4155NOT, pUR4157NOT,
pUR4159NOT, pUR4161NOT and pUR4163NOT) were subsequently transferred
to Aspergillus niger var. awamori ATCC 11358 (= CBS 115.52) and a mutant strain
Aspergillus niger var. awamori #40 (WO 91/19782) which has been obtained by
mutagenesis of A. niger var. awamori. Transformation with pUR41..NOT plasmids
was carried out as described in WO 91/19782 or by means of co-transformation
with plasmid pAN7-1 according to Punt P.J. and Van den Hondel C.A.M.J.J.
(1992). pAN7-1 comprises the hygromycin resistance gene of E. coli flanked by
Aspergillus expression signals. The yield of A. niger var. awamori (mutant #40)
protoplasts was 1-5 x 10^/g mycelium and the viability was 3-8%. Per
transformation 3-8 x 10^ viable protoplasts were incubated with 10 µg plasmid
DNA purified by the Qiagen method. A. niger var. awamori mutant #40 AmdS+
transformants were selected and purified on plates with minimal medium and
acetamide or acrylamide as sole nitrogen source. Direct selection resulted in up to
0.02 mutant #40 transformants per µg DNA. No A. niger var. awamori
transformants were obtained. Co-transformation of the mutant #40 strain was
performed with a mixture of one of the pUR41..NOT plasmids and pAN7-1 DNA
in a weight ratio of 7:3. pAN7-1 co-transformants were selected primarily on
minimal medium plates containing 100-150 µg/ml hygromycin, followed by
selection on plates with acetamide. The frequency of HmR colonies was about 2
transformants per ng; however, only 5% of the HmR colonies grew well on plates
with acetamide.

A. niger var. awamori mutant #40 transformants obtained by direct selection on
plates with acetamide are called AWC. Mutant #40 co-transformants growing well
on acetamide are called AWCM.



The following numbers of (co-)transformants were further analyzed:

Transformants   Number     Co-transformants   Number
AWC4155*        3          AWCM4155           3
AWC4157         7          AWCM4157           1
AWC4159         2          AWCM4159           5
AWC4161         2          AWCM4161           2
                           AWCM4163           2

* 4155 indicates the presence of plasmid pUR4155NOT in the mutant #40 strain.

2.6 ScFv production by Aspergillus transformants

Analysis of Aspergillus niger var. awamori mutant #40 transformants containing
ScFv-fragment encoding sequences after culturing in medium with maltodextrin as
an inducer.

AWC and AWCM transformants were grown in minimal medium (0.05% MgSO4,
0.6% NaNO3, 0.05% KCl, 0.15% KH2PO4 and trace elements) with 5%
maltodextrin (Sigma Dextrin Corn type I; D-2006). Media were sterilized for 30
min at 120°C. Fifty ml medium (shake flask 300 ml) were inoculated with 4 x 10^
spores/ml, followed by culturing in an air incubator (300 rpm) at 30°C for different
periods. Medium samples were taken after 45 to 50 hours and analyzed by
SDS-PAGE followed by Western blot analyses. Furthermore a quantitative functional
test was carried out by performing a PIN-ELISA assay.

2.6.1 Medium of ScFv-LYS and ScFv-HCG transformants
2.6.1a Western blot analysis and Coomassie Brilliant Blue-stained gels
Western blot analysis of medium samples of AWC(M)4155 (18 a.a. glaA signal
sequence-ScFv-LYS) (co-)transformants, in which anti-serum directed against
Fv-LYS was used, revealed a band with a molecular mass of about 31 kDa which is
absent in the medium of the mutant strain #40 (Figure 5). The presence of this
band, which runs at the position of a protein with the expected size, points at
secretion of ScFv-LYS in the culture medium.

In medium of several AWC(M)4159 (prepro-"glaA2"-KEX2-ScFv-LYS)
(co-)transformants a similar, much stronger, band was found indicating a more efficient
secretion of ScFv-LYS by these transformants. This protein band was also visible
on Coomassie Brilliant Blue-stained gels.

In medium samples of AWC(M)4157 (18 a.a. glaA signal sequence + ScFv-HCG) a
faint band was found, while the band in medium of AWC(M)4161 (prepro-"glaA2"-
KEX2-ScFv-HCG) (co-)transformants was clearly visible (molecular mass about 31
kDa). The aspecific signals were identical to the ones obtained with ScFv-LYS
transformants. Some of the results are shown in Figure 5 (Western blot).

Method: SDS-PAGE was carried out on 8-25% gradient gels using the Pharmacia
Phast system or on homogeneous 12.5% home-made SDS-gels. For Western blot
analysis a polyclonal anti-serum against Fv-LYS was used (1:1500) for the
detection of both ScFv-LYS and ScFv-HCG.



2.6.1b Analysis by PIN-ELISA

The amount of functional ScFv-LYS (as determined by a PIN-ELISA assay) in the
medium of AWC(M) transformants is given in Table 2.

Table 2

Transformant      construct                         ScFv-fragment (mg/l)
AWCM4155 #102     18 a.a. ss-glaA-ScFv-LYS          15 - 22 - 11
AWCM4155 #105     same                              3
AWC4155  #4       same                              10
AWC4155  #5       same                              2
AWCM4159 #101     prepro-"glaA2"-KEX2-ScFv-LYS      91 - 66 - 67
AWCM4159 #608     same                              3
AWCM4159 #610     same                              16
AWC4159  #701     same                              40
AWCM4161 #612     prepro-"glaA2"-KEX2-ScFv-HCG      4
AWC4161  #2       same                              1
A. niger var. awamori mutant #40                    0



The amount of ScFv-LYS in medium of AWC(M)4155 (18 a.a. glaA) transformants
ranged from 2 to 22 mg/l. AWC(M)4159 (co-)transformants (prepro-"glaA2"-
KEX2-construction) secrete up to about 90 mg/l into the medium, while no
production was found for the A. niger var. awamori mutant #40 strain.
With the quantitative PIN-ELISA assay for the determination of ScFv-HCG it was
found that AWC(M)4161 (co-)transformants ("glaA2"-KEX2-construction) secreted
up to 4 mg/l functional ScFv-HCG into the medium. However, in the medium of
AWC4157 (18 a.a. glaA signal sequence) transformants no ScFv-HCG was detected.
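
The ranges quoted in this paragraph can be recomputed from the values of Table 2. The sketch below does so under one labelled assumption: entries with several numbers (for example "15 - 22 - 11") are treated as separate measurements of the same transformant, which the table itself does not state. The dictionary keys and helper names are illustrative only.

```python
# Sketch: per-construct range of the PIN-ELISA values of Table 2 (mg/l).
# Assumption: multi-value table entries are separate measurements.

TABLE_2_MG_PER_L = {
    "pUR4155": [15, 22, 11, 3, 10, 2],    # 18 a.a. ss-glaA-ScFv-LYS
    "pUR4159": [91, 66, 67, 3, 16, 40],   # prepro-"glaA2"-KEX2-ScFv-LYS
    "pUR4161": [4, 1],                    # prepro-"glaA2"-KEX2-ScFv-HCG
}


def summarise(values: list[int]) -> dict[str, int]:
    """Number of values plus the lowest and highest value."""
    return {"n": len(values), "min": min(values), "max": max(values)}


def best_case_ratio(table: dict[str, list[int]], better: str, baseline: str) -> float:
    """Ratio of the highest value of one construct to that of another."""
    return max(table[better]) / max(table[baseline])


if __name__ == "__main__":
    for construct, values in TABLE_2_MG_PER_L.items():
        s = summarise(values)
        print(f"{construct}: {s['min']}-{s['max']} mg/l (n={s['n']})")
    ratio = best_case_ratio(TABLE_2_MG_PER_L, "pUR4159", "pUR4155")
    print(f"best pUR4159 / best pUR4155 = {ratio:.1f}")
```

The computed ranges (2 to 22 mg/l for pUR4155, a maximum of 91 mg/l for pUR4159, a maximum of 4 mg/l for pUR4161) agree with the figures given in the text.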
Method: PINs coated with either lysozyme or HCG were incubated with (diluted)
medium samples. Subsequently the PINs were incubated with antiserum against
Fv-LYS and Fv-HCG respectively, then with goat-anti-rabbit conjugate with alkaline
phosphatase. Finally the alkaline phosphatase enzyme activity was determined after
incubation with p-nitrophenyl phosphate and the optical density was measured at
405 nm. Using standard solutions of Fv-LYS and Fv-HCG respectively, the amount
of functional ScFv-LYS and ScFv-HCG was calculated.
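
The text does not say how the amount was calculated from the standard solutions. One common approach, shown here purely as an assumed illustration, is to interpolate the optical density of a sample linearly between the two bracketing standards and to multiply by the dilution factor. All numbers in the sketch are hypothetical and are not data from this specification.

```python
# Sketch: read a concentration off a standard curve by linear interpolation.
# The standards and the dilution factor are hypothetical illustration values.
from bisect import bisect_left

# (concentration in mg/l, optical density at 405 nm) for hypothetical standards,
# sorted by increasing optical density.
STANDARDS = [(0.0, 0.05), (0.5, 0.25), (1.0, 0.45), (2.0, 0.85)]


def concentration_from_od(od: float, standards=STANDARDS) -> float:
    """Interpolate linearly between the two standards bracketing od."""
    concs = [c for c, _ in standards]
    ods = [o for _, o in standards]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the standard curve; adjust the dilution")
    i = max(1, bisect_left(ods, od))
    frac = (od - ods[i - 1]) / (ods[i] - ods[i - 1])
    return concs[i - 1] + frac * (concs[i] - concs[i - 1])


def medium_concentration(od: float, dilution_factor: float) -> float:
    """Scale the interpolated value back to the undiluted medium."""
    return concentration_from_od(od) * dilution_factor


if __name__ == "__main__":
    # A sample diluted 20-fold that reads OD 0.35 in this hypothetical curve.
    print(medium_concentration(0.35, 20.0))
```

Samples whose reading falls outside the standard curve are rejected rather than extrapolated, which is why the method mentions diluted medium samples.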



Example 3   Construction of Aspergillus niger var. awamori integration vectors
            for the production of ScFv fragments, using the endoxylanase
            promoter and terminator and a DNA sequence encoding the
            endoxylanase secretion signal and the mature endoxylanase protein.

Although this Example describes the construction of expression plasmids encoding
fusion proteins between the mature endoxylanase protein and the ScFv fragment, it
is obvious that alternative expression plasmids can be constructed in much the
same way in which only part of the endoxylanase protein is used.

3.1 Construction of pUR4158-A.

After digesting plasmid pScFvLYSmyc (see Example 1.1) with PstI and XhoI, an
about 0.7 kb PstI-XhoI fragment could be isolated from agarose gel. This fragment
codes for a truncated Single Chain Fv-Lys fragment missing the first 5 and the last
5 amino acids (see the nucleotide sequence (SEQ. ID. NO: 25) and deduced amino
acid sequence (SEQ. ID. NO: 26) of the about 700 bp PstI-XhoI fragment encoding
the ScFv fragment of the monoclonal anti-lysozyme antibody D1.3 (ScFv LYS)
given below).

Nucleotide sequence and deduced amino acid sequence of ScFv LYS

    PstI
  1 CTGCAGGAGTCAGGACCTGGCCTGGTGGCGCCCTCACAGAGCCTGTCCAT 50
    LQESGPGLVAPSQSLSI

 51 CACATGCACCGTCTCAGGGTTCTCATTAACCGGCTATGGTGTAAACTGGG 100
    TCTVSGFSLTGYGVNW
    > CDR I <

101 TTCGCCAGCCTCCAGGAAAGGGTCTGGAGTGGCTGGGAATGATTTGGGGT 150
    VRQPPGKGLEWLGMIWG
    >

151 GATGGAAACACAGACTATAATTCAGCTCTCAAATCCAGACTGAGCATCAG 200
    DGNTDYNSALKSRLSIS
    CDR II <

201 CAAGGACAACTCCAAGAGCCAAGTTTTCTTAAAAATGAACAGTCTGCACA 250
    KDNSKSQVFLKMNSLH

251 CTGATGACACAGCCAGGTACTACTGTGCCAGAGAGAGAGATTATAGGCTT 300
    TDDTARYYCARERDYRL
    > CDR III

    BstEII
301 GACTACTGGGGCCAAGGCACCACGGTCACCGTCTCCTCAGGTGGAGGCGG 350
    DYWGQGTTVTVSSGGGG
    < >

    SacI
351 TTCAGGCGGAGGTGGCTCTGGCGGTGGCGGATCGGACATCGAGCTCACTC 400
    SGGGGSGGGGSDIELT
    Linker < > VL

401 AGTCTCCAGCCTCCCTTTCTGCGTCTGTGGGAGAAACTGTCACCATCACA 450
    QSPASLSASVGETVTIT

451 TGTCGAGCAAGTGGGAATATTCACAATTATTTAGCATGGTATCAGCAGAA 500
    CRASGNIHNYLAWYQQK
    > CDR I <

501 ACAGGGAAAATCTCCTCAGCTCCTGGTCTATTATACAACAACCTTAGCAG 550
    QGKSPQLLVYYTTTLA
    > CDR II

551 ATGGTGTGCCATCAAGGTTCAGTGGCAGTGGATCAGGAACACAATATTCT 600
    DGVPSRFSGSGSGTQYS

601 CTCAAGATCAACAGCCTGCAACCTGAAGATTTTGGGAGTTATTACTGTCA 650
    LKINSLQPEDFGSYYCQ
    >

    XhoI
651 ACATTTTTGGAGTACTCCTCGGACGTTCGGTGGAGGCACCAAGCTCGAG 699
    HFWSTPRTFGGGTKLE
    CDR III <

5 

The multiple cloning site of plasmid pEMBL9 (Dente et al.; 1983), ranging from
the EcoRI to the HindIII site, can be replaced by a synthetic DNA fragment having
the following nucleotide sequence (SEQ. ID. NO: 27-30).

          KEX2            Spacer      ScFv N-term.
          I   S   K   R   G   G   S   Q   V   Q   L   Q   *
AAT TC G ATA TCG AAG CGC GGC GGA TCC CAG GTG CAG CTG CAG TAA -
      GC TAT AGC TTC GCG CCG CCT AGG GTC CAC GTC GAC GTC ATT -
EcoRI  EcoRV                 BamHI               PstI

                  ScFv C-term.
   V   T   K   L   E   I   K   R   *   *
- GTG ACT AAG CTC GAG ATC AAA CGG TGA TAA GCT CG C TTA
- CAC TGA TTC GAG CTC TAG TTT GCC ACT ATT CGA GCG A AT TCG A
              XhoI                              AflII  HindIII

This DNA fragment can be used for replacing the multiple cloning site of plasmid
pEMBL9 (ranging from the EcoRI to the HindIII site). The 5'-part of the coding
strand of the synthetic DNA fragment codes for the KEX2 recognition site (ISKR),
a spacer (GGS) followed by the first 5 amino acids of the variable part of the
antibody heavy chain. The 3'-part of the coding sequence encodes the last 8 amino
acid residues of the variable part of the antibody light chain. Upon digesting the
obtained plasmid with PstI and XhoI a vector fragment of about 4 kb can be
isolated.

Upon ligating the about 0.7 kb PstI-XhoI fragment of pScFvLYSmyc with the about
4 kb vector fragment, pUR4158-A can be obtained containing the restored genes
encoding the VH and VL antibody fragments.
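
Because the synthetic fragment is specified as two separately printed strands, a simple pairing check helps to catch transcription slips in either strand. The sketch below, which is an illustration and not part of the original disclosure, compares the C-terminal half of the coding strand (5' to 3') with the lower strand as printed (3' to 5') and derives the single-stranded overhang left at the HindIII end. The function names are assumptions, and the "O" used to simulate a slip is a hypothetical example.

```python
# Sketch: check that the two printed strands of a synthetic fragment pair up
# and report the single-stranded overhang at the right-hand end.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-wise complement; characters other than A, C, G, T are left as-is."""
    return strand.translate(COMPLEMENT)


def mismatches(top_5to3: str, bottom_3to5: str) -> list[tuple[int, str, str]]:
    """Positions in the paired region where the two strands do not pair."""
    paired = min(len(top_5to3), len(bottom_3to5))
    return [
        (i, t, b)
        for i, (t, b) in enumerate(zip(top_5to3[:paired], bottom_3to5[:paired]))
        if complement(t) != b
    ]


def right_overhang_5to3(top_5to3: str, bottom_3to5: str) -> str:
    """Unpaired bases of the lower strand, rewritten in 5' -> 3' direction."""
    return bottom_3to5[len(top_5to3):][::-1]


# C-terminal half of the fragment, spaces removed.
TOP = "GTGACTAAGCTCGAGATCAAACGGTGATAAGCTCGCTTA"         # 5' -> 3'
BOTTOM = "CACTGATTCGAGCTCTAGTTTGCCACTATTCGAGCGAATTCGA"  # 3' -> 5'

if __name__ == "__main__":
    print("mismatches:", mismatches(TOP, BOTTOM))
    print("overhang:", right_overhang_5to3(TOP, BOTTOM))
    # Hypothetical transcription slip: a letter O in place of a C.
    slipped = TOP[:31] + "O" + TOP[32:]
    print("with slip:", mismatches(slipped, BOTTOM))
```

With the strands as given, the paired region shows no mismatches and the remaining overhang reads AGCT, which is the overhang expected at a HindIII end.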



3.2 Construction of pXYL2.

Plasmid pAW14B was the starting vector for the construction of a series of
expression plasmids containing exlA expression signals and genes coding for ScFv
fragments. The plasmid comprises an Aspergillus niger var. awamori chromosomal
5.2 kb SalI fragment on which the 0.7 kb exlA gene is located, together with 2.5 kb
of 5'-flanking sequences and 2.0 kb of 3'-flanking sequences (see Figure 6 = Figure
3 of UNILEVER's not prior-published WO 93/12237).

Upon digesting pAW14B with XbaI and BamHI, an about 3.2 kb XbaI-BamHI
fragment can be isolated comprising the exlA promoter, the exlA structural gene
and part of the exlA terminator area. This fragment can be cloned into plasmid
pBluescript (ex Stratagene) digested with the same enzymes, resulting in plasmid
pXYL1.

By applying PCR technology on the about 3.2 kb XbaI-BamHI fragment, it is
possible to change the 3'-end of the exlA structural gene by replacing the last
codon encoding serine and the stop codon TAA by the BamHI site GGA TCC
followed by 8 other codons comprising an EcoRV site and an EcoRI site using a
first (anti-sense) primer (A) given below (SEQ. ID. NO: 31-34) and a second
(sense) primer (B) also given below located upstream of the ScaI site (located in
the exlA gene). This sense primer corresponds with nucleotides 824-843 of Figure 1
of UNILEVER's not prior-published WO 93/12237 forming part of the exlA gene.
After digesting the resulting PCR product with ScaI and EcoRI, an about 175 bp
ScaI-EcoRI fragment can be isolated. Upon digesting pXYL1 with ScaI (partially)
and with EcoRI (partially), an about 6 kb ScaI-EcoRI fragment, comprising the
intact pBluescript DNA and the exlA promoter region and most of the exlA
structural gene, can be isolated.

Ligation of the about 175 bp ScaI-EcoRI fragment with the about 6 kb ScaI-EcoRI
fragment ex pXYL1 will result in a plasmid, called pXYL2, which differs from
pXYL1 in that the 3'-part of the exlA gene and the terminator fragment are
replaced by the newly obtained ScaI-EcoRI PCR fragment.



Oligonucleotides used for changing the 3'-end of the exlA structural gene by means
of PCR technology.

A. anti-sense primer

       V   T   I   S   S   *
5'-T GTC ACG ATC TCC TCT TAA GGGATAAGTGCCTTGGTAGTC-3'
   | ||| ||| ||| |||
3'-A CAG TGC TAG AGG CCTAGG CGATTAC AGTATAGCTTAAG CTGA-5'
                     g s a n v i s n s t
                     BamHI    EcoRV  EcoRI

N.B. The PCR oligonucleotide is bold-printed; the corresponding amino acids
are given in small print.

B. sense primer (20-oligomer)

5'-GA ACT AAC GAA CCG TCC ATC-3' (SEQ. ID. NO: 35)



3.3 Construction of pUR4455 and pUR4456

Starting from pAW14B, pAW14B-10 was constructed by removing the EcoRI site
originating from the pUC19 polylinker and introducing a NotI site.
This was achieved by partially digesting plasmid pAW14B with EcoRI and after
dephosphorylation the linear 7.9 kb EcoRI plasmids were isolated and religated in
the presence of the "EcoRI"-NotI linker:

5'-AAT TGCGGCCGC -3' (SEQ. ID. NO: 36).
        NotI

After selecting a plasmid still containing the EcoRI site in the upstream area of the
exlA structural gene, pAW14B-10 was obtained. Such a selection method is known to
a skilled person.

Subsequently the AflII site, located downstream of the exlA terminator, was
removed by partially cleaving plasmid pAW14B-10 with AflII and religating the
isolated, linearized plasmid after filling in the sticky ends, resulting in plasmid
pAW14B-11 after selecting the plasmid still containing the AflII site near the stop
codon of the exlA gene. Such a selection method is known to a skilled person.
This plasmid pAW14B-11 can be used for construction of a series of expression
plasmids comprising a DNA fragment coding for a fusion protein consisting of the
endoxylanase protein or part thereof and the ScFv fragment. Preferably the two
protein fragments are connected by a protease recognition site, e.g. the KEX2
cleavage site.

(i) Upon digesting plasmid pAW14B-11 with NotI and AflII, an about 4.7 kb
fragment can be isolated comprising the pUC19 vector and part of the exlA
terminator.

(ii) Upon digestion of pXYL2 with NotI and EcoRV, an about 3.2 kb
fragment can be isolated. Alternatively a NotI-BamHI fragment of about the same
length can be isolated.

(iii) Upon digesting pUR4158-A with EcoRV and AflII, an about 0.8 kb
fragment can be isolated encoding the ScFv-LYS preceded by a short (linker)
peptide comprising the KEX2 cleavage site and a spacer (GGS). Alternatively, a
BamHI-AflII fragment of about the same length can be isolated, which fragment
does not contain a DNA fragment encoding the KEX2 cleavage site.

A) For the construction of expression plasmids encoding the fusion protein
consisting of mature endoxylanase and ScFv-LYS, the about 4.7 kb NotI-AflII fragment of
pAW14B-11, the about 3.2 kb NotI-BamHI fragment of pXYL2 and the about 0.75
kb BamHI-AflII fragment of pUR4158-A are ligated resulting in pUR4455.

B) For the construction of expression plasmids encoding the fusion protein
consisting of mature endoxylanase and ScFv-LYS connected by the KEX2 cleavage
site, the about 4.7 kb NotI-AflII fragment of pAW14B-11, the about 3.2 kb NotI-EcoRV
fragment of pXYL2 and the about 0.75 kb EcoRV-AflII fragment of pUR4158-A
are ligated resulting in pUR4456.
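
A quick plausibility check on such three-fragment ligations is to add up the approximate fragment sizes and compare the sum with the size of the plasmid seen on a gel. The sketch below uses the "about" sizes quoted in A) and B); the dictionary layout and the function name are illustrative assumptions, and the sums are only as precise as the quoted sizes.

```python
# Sketch: expected plasmid sizes as the sum of the ligated fragment sizes (kb).
# Sizes are the approximate values quoted in A) and B) above.

LIGATIONS_KB = {
    "pUR4455": {
        "NotI-AflII fragment of pAW14B-11": 4.7,
        "NotI-BamHI fragment of pXYL2": 3.2,
        "BamHI-AflII fragment of pUR4158-A": 0.75,
    },
    "pUR4456": {
        "NotI-AflII fragment of pAW14B-11": 4.7,
        "NotI-EcoRV fragment of pXYL2": 3.2,
        "EcoRV-AflII fragment of pUR4158-A": 0.75,
    },
}


def expected_size_kb(fragments: dict[str, float]) -> float:
    """Sum of fragment sizes, rounded to the precision of the inputs."""
    return round(sum(fragments.values()), 2)


if __name__ == "__main__":
    for plasmid, fragments in LIGATIONS_KB.items():
        print(f"{plasmid}: about {expected_size_kb(fragments)} kb")
```

Both plasmids come out at roughly 8.65 kb, as expected for constructs that differ only by the few codons of the KEX2 site.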

15 

The constructed expression vectors can subsequently be transferred to moulds (for
example Aspergillus niger, Aspergillus niger var. awamori, Aspergillus nidulans etc.) by
means of conventional co-transformation techniques and the chimeric gene
comprising a DNA sequence encoding the desired ScFv fragment can then be
expressed via induction of the endoxylanase II promoter. The constructed vector
can also be provided with conventional selection markers (e.g. amdS or pyrG,
hygromycin etc.), e.g. by introducing the corresponding genes into the unique NotI
restriction site, and the mould can be transformed with the resulting vector to
produce the desired protein, essentially as described in Example 2 of
UNILEVER's not prior-published WO 93/12237.

Example 4   Isolation of gene fragments of antibodies raised against (oral)
            microorganisms.

Monoclonal antibodies raised against oral microorganisms have been described in
the literature (De Soet et al.; 1990), an example of which is OMVU10 raised
against streptococci. For the production of ScFv fragments derived from these
monoclonal antibodies the gene fragments encoding the variable regions of the
heavy and light chains had to be isolated. The isolation of RNA from the
hybridoma cell lines, the preparation of cDNA and amplification of gene fragments
encoding the variable regions of antibodies by PCR were performed according to
standard procedures known from the literature (see for example Orlandi et al.;
1989). For the PCR amplification different oligonucleotide primers have been
used,

for the heavy chain fragment:

A: 5'-AGG TSM ARC TGC AGS AGT CWG G-3' (SEQ. ID. NO: 37)
                PstI

in which S is C or G, M is A or C, R is A or G, and W is A or T

and

B: 5'-TGA GGA GAC GGT GAC CGT GGT CCC TTG GCC CC-3'
                  BstEII                (SEQ. ID. NO: 38),

and for the light chain fragment (Kappa):

C: 5'-GAC ATT GAG CTC ACC CAG TCT CCA-3' (SEQ. ID. NO: 39)
              SacI

and

D: 5'-GTT TGA TCT CGA GCT TGG TCC C-3' (SEQ. ID. NO: 40).
                XhoI
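
Primer A is a degenerate primer: each of the letters S, M, R and W stands for two alternative bases, so the primer is in fact a mixture of sequences. The sketch below, an illustration that is not part of the original disclosure, expands primer A into its individual variants using the base codes defined above and confirms that the PstI site (CTGCAG) is present in every variant. The names `IUPAC`, `PRIMER_A` and `expand` are assumptions made for the example.

```python
# Sketch: expand the degenerate heavy-chain primer A into all of its variants.
from itertools import product

# Base codes as defined in the text for primer A.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG",  # C or G
    "M": "AC",  # A or C
    "R": "AG",  # A or G
    "W": "AT",  # A or T
}

PRIMER_A = "AGGTSMARCTGCAGSAGTCWGG"  # SEQ. ID. NO: 37, spaces removed
PSTI_SITE = "CTGCAG"


def expand(primer: str) -> list[str]:
    """All concrete sequences represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    variants = expand(PRIMER_A)
    print(len(variants), "variants")
    print("PstI site in every variant:", all(PSTI_SITE in v for v in variants))
```

Five two-fold degenerate positions give 32 variants, and because the PstI site lies entirely in non-degenerate positions, every amplified heavy chain fragment carries it for the subsequent PstI-BstEII cloning step.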

The heavy chain PCR fragment obtained in this way was digested with PstI and
BstEII and a PstI-BstEII fragment of about 0.33 kb was isolated. The thus obtained
fragment can be cloned into pUR4158-A. To this end pUR4158-A is digested with
PstI and BstEII, after which an about 4.4 kb vector fragment can be isolated.
Ligation of the above described heavy chain fragment of OMVU10 with the about
4.4 kb vector fragment will result in pUR4158-A10H. In this plasmid the heavy
chain fragment of the lysozyme antibody, which was originally present, is replaced
by that of the OMVU10 antibody.

The light chain PCR fragment obtained in a similar way was digested with SacI
and XhoI, and a SacI-XhoI fragment of about 0.3 kb was isolated. After digestion
of pUR4158-A10H with SacI and XhoI, a vector fragment of about 4.4 kb can be
isolated. Ligation of this vector fragment with the above described light chain
fragment of OMVU10 will result in pUR4457. In this plasmid both the heavy chain
fragment and the light chain fragment of the lysozyme antibody are replaced by the
appropriate heavy and light chain fragments of OMVU10. The nucleotide sequence
(SEQ. ID. NO: 41) and the deduced amino acid sequence (SEQ. ID. NO: 42) of
the PstI-XhoI fragment present in pUR4457 containing the thus obtained gene
encoding an ScFv fragment of OMVU10 is given below. The first 5 codons and the
last 5 codons are given in Example 3.1 above showing the overlap with the PstI and
XhoI sites.



Nucleotide sequence and deduced amino acid sequence of ScFv OMVU10

    PstI
  1 CTGCAGGAGTCAGGGGGAGGCTTAGTGCAGCCTGGAGGGTCCCGGAAACT 50
    LQESGGGLVQPGGSRKL

 51 CTCCTGTGCAGCCTCTGGATTCACTTTCAGTAACTTTGGAATGCACTGGG 100
    SCAASGFTFSNFGMHW
    > CDR I <

101 TTCGTCAGGCTCCAGAGAAGGGGCTGGAGTGGGTCGCATACATTAGTAGT 150
    VRQAPEKGLEWVAYISS
    >

151 GGCGGTACTACCATCTACTATTCAGACACAATGAAGGGCCGATTCACCAT 200
    GGTTIYYSDTMKGRFTI
    CDR II <

201 CTCCAGAGACAATCCCAAGAACACCCTGTTCCTGCAAATGACCAGTCTAA 250
    SRDNPKNTLFLQMTSL

251 GGTCTGAGGACACGGCCATGTATTTCTGTGCAAGATCCTGGGCCTATGCT 300
    RSEDTAMYFCARSWAYA
    > CDR III

    BstEII
301 ATGGACTACTGGGGCCAAGGGACCACGGTCACCGTCTCCTCAGGTGGAGG 350
    MDYWGQGTTVTVSSGGG
    < >

    SacI
351 CGGTTCAGGCGGAGGTGGCTCTGGCGGTGGCGGATCGGACATCGAGCTCA 400
    GSGGGGSGGGGSDIEL
    Linker < > VL

401 CCCAGTCTCCATCTTATCTTGCTGCATCTCCTGGAGAAATCATTACTATT 450
    TQSPSYLAASPGEIITI

451 AATTGCAGGGCAAGTAAGAGTATTAGCAAATATTTAGCCTGGTATCAAGA 500
    NCRASKSISKYLAWYQE
    > CDR I <

501 GAAACCTGGAAAAACAAATAAGCTTCTTATCTACTCTGGATCCATTTTGC 550
    KPGKTNKLLIYSGSIL
    > CDR II

551 AATCTGGAATTCCATCAAGGTTCAGTGGCAGTGGATCTGGTACAGATTTC 600
    QSGIPSRFSGSGSGTDF
    <

601 ACTCTCACCATCAGTAGCCTGGAGCCTGAAGATTTTGCAATGTATTACTG 650
    TLTISSLEPEDFAMYYC

    XhoI
651 TCAACAGCATAATGAATACCCGTGGACGTTCGGTGGAGGGACCAAGCTCGAG 702
    QQHNEYPWTFGGGTKLE
    > CDR III <



25 



Example 5   Construction of an expression cassette for the production of an
            OMVU10 ScFv fragment.
After digesting pUR4457 (see Example 4) with EcoRV and AflII, an about 0.8 kb
fragment can be isolated encoding the ScFv-OMVU10 preceded by a short (linker)
peptide comprising the KEX2 cleavage site and the GGS spacer. Alternatively, a
BamHI-AflII fragment of about 0.75 kb can be isolated for the construction of
expression plasmids coding for fusion proteins not containing a KEX2 cleavage
site.

Upon ligating the thus obtained fragments with the fragments obtained in 3.3 (i)
and (ii) in the same way as described in 3.3 B) and A), an expression plasmid can
be obtained containing a DNA sequence coding for a fusion protein comprising the
endoxylanase protein and the ScFv OMVU10 fragment, either with (pUR4460) or
without (pUR4459) the KEX2 cleavage site, respectively.

Analogous to the method described in Example 3, the resulting plasmids (either
with or without an added selection marker) can be introduced into Aspergillus.

Example 6   Isolation of gene fragments of an antibody raised against human
            pregnancy hormone (HCG).
In much the same way as described in Example 4, gene fragments coding for the
variable regions of the heavy and the light chains of anti-HCG antibodies were
isolated and can be cloned into plasmid pUR4158-A which results in plasmid
pUR4458. The nucleotide sequence (SEQ. ID. NO: 7) and the deduced amino acid
sequence (SEQ. ID. NO: 8) of the PstI-XhoI fragment encoding the ScFv-HCG
fragment were given above in Example 1.2.

10 

Example 7   Construction of expression cassettes for the production of ScFv
            fragments, using the endoxylanase promoter and terminator and a
            DNA sequence encoding the prepro-"glaA2" protein.
7.1 Construction of pAW14B-12.

Plasmid pAW14B-12 was constructed using pAW14B-11 (see Example 3.3) as
starting material. After digestion of pAW14B-11 with AflII (located at the exlA stop
codon) and BglII (located in the exlA promoter) the 2.4 kb AflII-BglII fragment,
containing part of the exlA promoter and the exlA gene, was isolated.
After partial digestion of this fragment with BspHI (located in the exlA promoter
and the exlA start codon) the isolated 1.8 kb BglII-BspHI exlA promoter fragment
(up to the ATG) was ligated with the isolated 5.5 kb AflII-BglII fragment of
pAW14B-11, containing the exlA terminator, in the presence of the synthetic DNA
oligonucleotides:

    (BspHI)              AflII
5'- CAT GCA GTC TTC GGG C      -3' (SEQ. ID. NO: 43)
3'-      GT CAG AAG CCC GAA TT -5' (SEQ. ID. NO: 44)
            BbsI

resulting in pAW14B-12.

7.2 Assembly of expression cassettes

(i) Upon digesting pAW14B-12 with BbsI (partially) and AflII, an about 7.3
kb BspHI-AflII vector fragment was isolated.

(ii) From plasmid pAN56-4 (described in the above mentioned reference of
M.P. Broekhuijsen et al.) an about 1.9 kb NcoI-EcoRV fragment was isolated,
comprising part of the glaA gene, starting from the ATG initiation codon (which
coincides with the NcoI site), and coding for the glucoamylase prepro part and the
first 514 amino acids of the mature glucoamylase ("glaA2").

(iii) From the plasmids pUR4158-A (encoding the ScFv-LYS fragment
preceded by the KEX2 recognition site and the GGS spacer: see Example 3.1),
pUR4457 (encoding the ScFv-OMVU10 fragment preceded by the KEX2
recognition site and the GGS spacer: see Example 4), and pUR4458 (encoding
the ScFv-HCG fragment preceded by the KEX2 recognition site and the GGS
spacer: see Example 6) EcoRV-AflII fragments of about 0.8 kb were isolated.

Upon ligating (i) the BspHI-AflII vector fragment, (ii) the NcoI-EcoRV glaA
fragment (NcoI sticky ends are compatible with BspHI sticky ends), and either of
the EcoRV-AflII ScFv encoding fragments, a set of expression plasmids can be
obtained.
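
The remark that NcoI and BspHI sticky ends are compatible follows from the way the two enzymes cut. The sketch below derives the 5' overhang from each recognition site and its cut position and compares the overhangs. The cut positions (C^CATGG for NcoI, T^CATGA for BspHI, G^GATCC for BamHI, A^AGCTT for HindIII) are standard enzyme data that are not stated in this specification, and the helper names are illustrative.

```python
# Sketch: derive 5' overhangs from recognition sites and compare them.
# Each entry: (palindromic recognition site 5'->3', cut offset on the top strand).
# Cut offsets are standard enzyme data assumed here, not taken from the text.

ENZYMES = {
    "NcoI": ("CCATGG", 1),
    "BspHI": ("TCATGA", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}


def overhang(enzyme: str) -> str:
    """5' overhang of a palindromic site cut symmetrically on both strands."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Ends can anneal if the (self-complementary) overhangs are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


if __name__ == "__main__":
    for name in ENZYMES:
        print(name, overhang(name))
    print("NcoI + BspHI compatible:", compatible("NcoI", "BspHI"))
    print("NcoI + BamHI compatible:", compatible("NcoI", "BamHI"))
```

Both NcoI and BspHI leave a CATG overhang, so the NcoI end of the glaA fragment can be joined to the BspHI end of the vector fragment with the ATG initiation codon kept in place.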

pUR4462   PexlA - prepro-"glaA2"-KEX2-ScFv-LYS
pUR4463   PexlA - prepro-"glaA2"-KEX2-ScFv-HCG
pUR4464   PexlA - prepro-"glaA2"-KEX2-ScFv-OMVU10
After insertion of the amdS selection marker into the NotI site, the resulting
plasmids were introduced into Aspergillus, as described in Example 3.

73 Pioduction of ScFv-LYS 

Upon growth of the resulting Aspergillus niger van awamori transformed with 
25 pUR4462 in a 10 litre fermenter, the culture medium was analyzed by 

polyacrylamide gel electrophoresis. Figure 7 shows the gel after it was stained with 
Coomassie Brilliant Blue and with arrows are indicated the released ScFv-LYS 
fragment and the fusion protein and/or the truncated glaA protein. 
The amount of "active" ScFv-LYS was determined to be about 250 mg/1. 
30 It is obvious that further optimization of the fermentation conditions or 

mutagenesis of the production strain will result in even higher production levels. 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 
(i) APPLICANT: 

(A) NAME: UNILEVER N.V. 
(B) STREET: Weena 455 
(C) CITY: Rotterdam 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-3013 AL 

(A) NAME: UNILEVER PLC 
(B) STREET: Unilever House Blackfriars 
(C) CITY: London 
(E) COUNTRY: United Kingdom 
(F) POSTAL CODE (ZIP): EC4P 4BQ 

(A) NAME: NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETENSCHAPPELIJK ONDERZOEK TNO 
(B) STREET: Schoemakersstraat 97 
(C) CITY: Delft 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-2628 VK 

(A) NAME: Leon Gerardus Joseph FRENKEN 
(B) STREET: Geldersestraat 90 
(C) CITY: Rotterdam 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-3011 MP 

(A) NAME: Robert F.M. van GORCOM 
(B) STREET: Liberiastraat 7 
(C) CITY: Delft 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-2622 DE 

(A) NAME: Johanna G.M. HESSING 
(B) STREET: Adema van Scheltemaplein 38 
(C) CITY: Delft 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-2624 PG 

(A) NAME: Cornelis Antonius M.J.J. van den HONDEL 
(B) STREET: Waterlelie 124 
(C) CITY: Gouda 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-2804 PZ 

(A) NAME: Wouter MUSTERS 
(B) STREET: Hipperspark 138 
(C) CITY: Maassluis 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-3141 RD 

(A) NAME: Johannes Maria A. VERBAKEL 
(B) STREET: Ingeland 9 



WO 94/29457 PCT/EP94/01906 



(C) CITY: Maasland 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-3155 GC 

(A) NAME: Cornelis Theodorus VERRIPS 
(B) STREET: Hagedoorn 18 
(C) CITY: Maassluis 
(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-3142 KB 



(ii) TITLE OF INVENTION: 

Process for producing fusion proteins comprising 
ScFv fragments by a transformed mould 

(iii) NUMBER OF SEQUENCES: 45 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..855 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAG CTT GCA TGC AAA TTC TAT TTC AAG GAG ACA GTC ATA ATG AAA TAC 48 
Lys Leu Ala Cys Lys Phe Tyr Phe Lys Glu Thr Val Ile Met Lys Tyr 
1 5 10 15 

CTA TTG CCT ACG GCA GCC GCT GGA TTG TTA TTA CTC GCT GCC CAA CCA 96 
Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro 
20 25 30 

GCG ATG GCC CAG GTG CAG CTG CAG GAG TCA GGA CCT GGC CTG GTG GCG 144 
Ala Met Ala Gln Val Gln Leu Gln Glu Ser Gly Pro Gly Leu Val Ala 
35 40 45 

CCC TCA CAG AGC CTG TCC ATC ACA TGC ACC GTC TCA GGG TTC TCA TTA 192 
Pro Ser Gln Ser Leu Ser Ile Thr Cys Thr Val Ser Gly Phe Ser Leu 
50 55 60 

ACC GGC TAT GGT GTA AAC TGG GTT CGC CAG CCT CCA GGA AAG GGT CTG 240 
Thr Gly Tyr Gly Val Asn Trp Val Arg Gln Pro Pro Gly Lys Gly Leu 
65 70 75 80 

GAG TGG CTG GGA ATG ATT TGG GGT GAT GGA AAC ACA GAC TAT AAT TCA 288 
Glu Trp Leu Gly Met Ile Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser 
85 90 95 

GCT CTC AAA TCC AGA CTG AGC ATC AGC AAG GAC AAC TCC AAG AGC CAA 336 
Ala Leu Lys Ser Arg Leu Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln 
100 105 110 

GTT TTC TTA AAA ATG AAC AGT CTG CAC ACT GAT GAC ACA GCC AGG TAC 384 
Val Phe Leu Lys Met Asn Ser Leu His Thr Asp Asp Thr Ala Arg Tyr 
115 120 125 

TAC TGT GCC AGA GAG AGA GAT TAT AGG CTT GAC TAC TGG GGC CAA GGC 432 
Tyr Cys Ala Arg Glu Arg Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly 
130 135 140 

ACC ACG GTC ACC GTC TCC TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC 480 
Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
145 150 155 160 

TCT GGC GGT GGC GGA TCG GAC ATC GAG CTC ACT CAG TCT CCA GCC TCC 528 
Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser 
165 170 175 

CTT TCT GCG TCT GTG GGA GAA ACT GTC ACC ATC ACA TGT CGA GCA AGT 576 
Leu Ser Ala Ser Val Gly Glu Thr Val Thr Ile Thr Cys Arg Ala Ser 
180 185 190 
GGG AAT ATT CAC AAT TAT TTA GCA TGG TAT CAG CAG AAA CAG GGA AAA 624 
Gly Asn Ile His Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys 
195 200 205 

TCT CCT CAG CTC CTG GTC TAT TAT ACA ACA ACC TTA GCA GAT GGT GTG 672 
Ser Pro Gln Leu Leu Val Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val 
210 215 220 

CCA TCA AGG TTC AGT GGC AGT GGA TCA GGA ACA CAA TAT TCT CTC AAG 720 
Pro Ser Arg Phe Ser Gly Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys 
225 230 235 240 

ATC AAC AGC CTG CAA CCT GAA GAT TTT GGG AGT TAT TAC TGT CAA CAT 768 
Ile Asn Ser Leu Gln Pro Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His 
245 250 255 

TTT TGG AGT ACT CCT CGG ACG TTC GGT GGA GGC ACC AAG CTC GAG ATC 816 
Phe Trp Ser Thr Pro Arg Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile 
260 265 270 

AAA CGG GAA CAA AAA CTC ATC TCA GAA GAG GAT CTG AAT TAATAATGAT 865 
Lys Arg Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn 
275 280 285 

CAAACGGTAA TAAGGATCCA GCTCGAATTC 895 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 285 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



Lys Leu Ala Cys Lys Phe Tyr Phe Lys Glu Thr Val Ile Met Lys Tyr 
1 5 10 15 

Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro 
20 25 30 

Ala Met Ala Gln Val Gln Leu Gln Glu Ser Gly Pro Gly Leu Val Ala 
35 40 45 

Pro Ser Gln Ser Leu Ser Ile Thr Cys Thr Val Ser Gly Phe Ser Leu 
50 55 60 

Thr Gly Tyr Gly Val Asn Trp Val Arg Gln Pro Pro Gly Lys Gly Leu 
65 70 75 80 

Glu Trp Leu Gly Met Ile Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser 
85 90 95 

Ala Leu Lys Ser Arg Leu Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln 
100 105 110 

Val Phe Leu Lys Met Asn Ser Leu His Thr Asp Asp Thr Ala Arg Tyr 
115 120 125 

Tyr Cys Ala Arg Glu Arg Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly 
130 135 140 

Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
145 150 155 160 

Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser 
165 170 175 

Leu Ser Ala Ser Val Gly Glu Thr Val Thr Ile Thr Cys Arg Ala Ser 
180 185 190 

Gly Asn Ile His Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys 
195 200 205 

Ser Pro Gln Leu Leu Val Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val 
210 215 220 

Pro Ser Arg Phe Ser Gly Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys 
225 230 235 240 

Ile Asn Ser Leu Gln Pro Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His 
245 250 255 

Phe Trp Ser Thr Pro Arg Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile 
260 265 270 

Lys Arg Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn 
275 280 285 
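
The relation between the coding sequence of SEQ ID NO: 2 (CDS at positions 1..855) and the 285 amino acids of SEQ ID NO: 3 can be checked with a minimal sketch such as the following. It assumes the standard genetic code; the function name translate is illustrative only, and only the first 48 nucleotides of SEQ ID NO: 2 are used as input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Whitespace is ignored, so sequences may be pasted in triplet layout.
    """
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("length of coding sequence is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 48 nucleotides of SEQ ID NO: 2.
first_row = "AAG CTT GCA TGC AAA TTC TAT TTC AAG GAG ACA GTC ATA ATG AAA TAC"
print(translate(first_row))  # KLACKFYFKETVIMKY

# A CDS of 855 nucleotides corresponds to 855 / 3 = 285 codons.
print(855 // 3)  # 285
```

The one-letter result corresponds to Lys Leu Ala Cys Lys Phe Tyr Phe Lys Glu Thr Val Ile Met Lys Tyr, the first 16 residues of SEQ ID NO: 3. 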



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TCGAGATCAA ACGGTAATGA G 21 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AATTCTCATT ACCGTTTGAT C 21 
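
SEQ ID NO: 4 and SEQ ID NO: 5 are largely complementary to each other, as the following minimal sketch shows. It assumes that the two oligonucleotides are meant to be annealed with a 4-nucleotide 5' overhang on each strand; that assumption and the function names are illustrative and are not stated in this listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace and upper-case a sequence."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def anneal(top, bottom, overhang=4):
    """Pair two oligonucleotides that each carry a 5' overhang.

    Returns (top 5' overhang, bottom 5' overhang, double-stranded core
    written on the top strand). Raises ValueError if the cores do not pair.
    ASSUMPTION: both overhangs have the same length `overhang`.
    """
    top, bottom = clean(top), clean(bottom)
    bottom_rc = reverse_complement(bottom)
    core_top = top[overhang:]
    core_bottom = bottom_rc[:len(bottom_rc) - overhang]
    if core_top != core_bottom:
        raise ValueError("strands are not complementary in the core region")
    return top[:overhang], bottom[:overhang], core_top


seq_id_4 = "TCGAGATCAA ACGGTAATGA G"
seq_id_5 = "AATTCTCATT ACCGTTTGAT C"
print(anneal(seq_id_4, seq_id_5))
# ('TCGA', 'AATT', 'GATCAAACGGTAATGAG')
```

Under these assumptions the duplex has a 17 bp core with a TCGA overhang, which matches the overhang produced by XhoI, and an AATT overhang, which matches the overhang produced by EcoRI. 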



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Glu Ile Lys Arg 
1 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..717 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CTG CAG GAG TCT GGG GGA CAC TTA GTG AAG CCT GGA GGG TCC CTG AAA 48 
Leu Gln Glu Ser Gly Gly His Leu Val Lys Pro Gly Gly Ser Leu Lys 
1 5 10 15 

CTC TCC TGT GCA GCC TCT GGA TTC GCT TTC AGT AGC TTT GAC ATG TCT 96 
Leu Ser Cys Ala Ala Ser Gly Phe Ala Phe Ser Ser Phe Asp Met Ser 
20 25 30 

TGG ATT CGC CAG ACT CCG GAG AAG AGG CTG GAG TGG GTC GCA AGC ATT 144 
Trp Ile Arg Gln Thr Pro Glu Lys Arg Leu Glu Trp Val Ala Ser Ile 
35 40 45 

ACT AAT GTT GGT ACT TAC ACC TAC TAT CCA GGC AGT GTG AAG GGC CGA 192 
Thr Asn Val Gly Thr Tyr Thr Tyr Tyr Pro Gly Ser Val Lys Gly Arg 
50 55 60 

TTC TCC ATC TCC AGA GAC AAT GCC AGG AAC ACC CTA AAC CTG CAA ATG 240 
Phe Ser Ile Ser Arg Asp Asn Ala Arg Asn Thr Leu Asn Leu Gln Met 
65 70 75 80 

AGC AGT CTG AGG TCT GAG GAC ACG GCC TTG TAT TTC TGT GCA AGA CAG 288 
Ser Ser Leu Arg Ser Glu Asp Thr Ala Leu Tyr Phe Cys Ala Arg Gln 
85 90 95 

GGG ACT GCG GCA CAA CCT TAC TGG TAC TTC GAT GTC TGG GGC CAA GGG 336 
Gly Thr Ala Ala Gln Pro Tyr Trp Tyr Phe Asp Val Trp Gly Gln Gly 
100 105 110 

ACC ACG GTC ACC GTC TCC TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC 384 
Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
115 120 125 

TCT GGC GGT GGC GGA TCG GAC ATC GAG CTC ACC CAG TCT CCA AAA TCC 432 
Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Lys Ser 
130 135 140 

ATG TCC ATG TCC GTA GGA GAG AGG GTC ACC TTG AGC TGC AAG GCC AGT 480 
Met Ser Met Ser Val Gly Glu Arg Val Thr Leu Ser Cys Lys Ala Ser 
145 150 155 160 

GAG ACT GTG GAT TCT TTT GTG TCC TGG TAT CAA CAG AAA CCA GAA CAG 528 
Glu Thr Val Asp Ser Phe Val Ser Trp Tyr Gln Gln Lys Pro Glu Gln 
165 170 175 

TCT CCT AAA TTG TTG ATA TTC GGG GCA TCC AAC CGG TTC AGT GGG GTC 576 
Ser Pro Lys Leu Leu Ile Phe Gly Ala Ser Asn Arg Phe Ser Gly Val 
180 185 190 

CCC GAT CGC TTC ACT GGC AGT GGA TCT GCA ACA GAC TTC ACT CTG ACC 624 
Pro Asp Arg Phe Thr Gly Ser Gly Ser Ala Thr Asp Phe Thr Leu Thr 
195 200 205 

ATC AGC AGT GTG CAG GCT GAG GAC TTT GCG GAT TAC CAC TGT GGA CAG 672 
Ile Ser Ser Val Gln Ala Glu Asp Phe Ala Asp Tyr His Cys Gly Gln 
210 215 220 

ACT TAC AAT CAT CCG TAT ACG TTC GGA GGG GGG ACC AAG CTC GAG 717 
Thr Tyr Asn His Pro Tyr Thr Phe Gly Gly Gly Thr Lys Leu Glu 
225 230 235 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Leu Gln Glu Ser Gly Gly His Leu Val Lys Pro Gly Gly Ser Leu Lys 
1 5 10 15 

Leu Ser Cys Ala Ala Ser Gly Phe Ala Phe Ser Ser Phe Asp Met Ser 
20 25 30 

Trp Ile Arg Gln Thr Pro Glu Lys Arg Leu Glu Trp Val Ala Ser Ile 
35 40 45 

Thr Asn Val Gly Thr Tyr Thr Tyr Tyr Pro Gly Ser Val Lys Gly Arg 
50 55 60 

Phe Ser Ile Ser Arg Asp Asn Ala Arg Asn Thr Leu Asn Leu Gln Met 
65 70 75 80 

Ser Ser Leu Arg Ser Glu Asp Thr Ala Leu Tyr Phe Cys Ala Arg Gln 
85 90 95 

Gly Thr Ala Ala Gln Pro Tyr Trp Tyr Phe Asp Val Trp Gly Gln Gly 
100 105 110 

Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
115 120 125 

Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Lys Ser 
130 135 140 

Met Ser Met Ser Val Gly Glu Arg Val Thr Leu Ser Cys Lys Ala Ser 
145 150 155 160 

Glu Thr Val Asp Ser Phe Val Ser Trp Tyr Gln Gln Lys Pro Glu Gln 
165 170 175 

Ser Pro Lys Leu Leu Ile Phe Gly Ala Ser Asn Arg Phe Ser Gly Val 
180 185 190 

Pro Asp Arg Phe Thr Gly Ser Gly Ser Ala Thr Asp Phe Thr Leu Thr 
195 200 205 

Ile Ser Ser Val Gln Ala Glu Asp Phe Ala Asp Tyr His Cys Gly Gln 
210 215 220 

Thr Tyr Asn His Pro Tyr Thr Phe Gly Gly Gly Thr Lys Leu Glu 
225 230 235 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

AATTCCATGG GCTTCCGATC TCTACTCGCC CTGAGCGGCC TCGTCTGCAC 50 
AGGGTTGGCA CAGGTGCAGC TGCAGTAAGT GACTAAGCTC GAGATCAAAC 100 
GGTGATA 107 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGCTTATCAC CGTTTGATCT CGAGCTTAGT CACTTACTGC AGCTGCACCT 50 
GTGCCAACCC TGTGCAGACG AGGCCGCTCA GGGCGAGTAG AGATCGGAAG 100 
CCCATGG 107 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Gly Phe Arg Ser Leu Leu Ala Leu Ser Gly Leu Val Cys Thr 
1 5 10 15 

Gly Leu Ala Gln Val Gln Leu Gln 
20 

(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Val Thr Lys Leu Glu Ile Lys Arg 
1 5 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GCTTCCTCCC TTTTAGACGC AACTGAGAGC CTGAGGTTCA TCCCCAGCAT 50 
CATTACACCT GAGC 64 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CATGGCTGAG GTGTAATGAT GGTGGGGATG AAGCTCAGGC TCTCAGTTGC 50 
GTCTAAAAGG GAGGAAGC 68 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CATGGCCGAT ATCGCAAGCT TCCG 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GATCCGGAAG CTTGCGATAT CGGC 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
AATTCGATAT CGAAGCGCGG CGGATCCCAG GTGCAGCTGC A 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GCTGCACCTG GGATCCGCCG CGCTTCGATA TCG 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Ile Ser Lys Arg Gly Gly Ser Gln Val Gln Leu Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
AATTCGATAT CGATCGAAGG TCGAGGCGGA TCCCAGGTGC AGCTGCAG 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GCTGCACCTG GGATCCGCCT CGACCTTCGA TCGATATCG 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Ile Ser Ile Glu Gly Arg Gly Gly Ser Gln Val Gln Leu Gln 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GGCCGCTGTG CAG 13 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AATTCTGCAC AGC 13 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 699 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..699 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

CTG CAG GAG TCA GGA CCT GGC CTG GTG GCG CCC TCA CAG AGC CTG TCC 48 
Leu Gln Glu Ser Gly Pro Gly Leu Val Ala Pro Ser Gln Ser Leu Ser 
1 5 10 15 

ATC ACA TGC ACC GTC TCA GGG TTC TCA TTA ACC GGC TAT GGT GTA AAC 96 
Ile Thr Cys Thr Val Ser Gly Phe Ser Leu Thr Gly Tyr Gly Val Asn 
20 25 30 

TGG GTT CGC CAG CCT CCA GGA AAG GGT CTG GAG TGG CTG GGA ATG ATT 144 
Trp Val Arg Gln Pro Pro Gly Lys Gly Leu Glu Trp Leu Gly Met Ile 
35 40 45 

TGG GGT GAT GGA AAC ACA GAC TAT AAT TCA GCT CTC AAA TCC AGA CTG 192 
Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser Ala Leu Lys Ser Arg Leu 
50 55 60 

AGC ATC AGC AAG GAC AAC TCC AAG AGC CAA GTT TTC TTA AAA ATG AAC 240 
Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln Val Phe Leu Lys Met Asn 
65 70 75 80 

AGT CTG CAC ACT GAT GAC ACA GCC AGG TAC TAC TGT GCC AGA GAG AGA 288 
Ser Leu His Thr Asp Asp Thr Ala Arg Tyr Tyr Cys Ala Arg Glu Arg 
85 90 95 

GAT TAT AGG CTT GAC TAC TGG GGC CAA GGC ACC ACG GTC ACC GTC TCC 336 
Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val Ser 
100 105 110 

TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC TCT GGC GGT GGC GGA TCG 384 
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 
115 120 125 

GAC ATC GAG CTC ACT CAG TCT CCA GCC TCC CTT TCT GCG TCT GTG GGA 432 
Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser Leu Ser Ala Ser Val Gly 
130 135 140 

GAA ACT GTC ACC ATC ACA TGT CGA GCA AGT GGG AAT ATT CAC AAT TAT 480 
Glu Thr Val Thr Ile Thr Cys Arg Ala Ser Gly Asn Ile His Asn Tyr 
145 150 155 160 

TTA GCA TGG TAT CAG CAG AAA CAG GGA AAA TCT CCT CAG CTC CTG GTC 528 
Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys Ser Pro Gln Leu Leu Val 
165 170 175 

TAT TAT ACA ACA ACC TTA GCA GAT GGT GTG CCA TCA AGG TTC AGT GGC 576 
Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val Pro Ser Arg Phe Ser Gly 
180 185 190 

AGT GGA TCA GGA ACA CAA TAT TCT CTC AAG ATC AAC AGC CTG CAA CCT 624 
Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys Ile Asn Ser Leu Gln Pro 
195 200 205 

GAA GAT TTT GGG AGT TAT TAC TGT CAA CAT TTT TGG AGT ACT CCT CGG 672 
Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His Phe Trp Ser Thr Pro Arg 
210 215 220 

ACG TTC GGT GGA GGC ACC AAG CTC GAG 699 
Thr Phe Gly Gly Gly Thr Lys Leu Glu 
225 230 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Leu Gln Glu Ser Gly Pro Gly Leu Val Ala Pro Ser Gln Ser Leu Ser 
1 5 10 15 

Ile Thr Cys Thr Val Ser Gly Phe Ser Leu Thr Gly Tyr Gly Val Asn 
20 25 30 

Trp Val Arg Gln Pro Pro Gly Lys Gly Leu Glu Trp Leu Gly Met Ile 
35 40 45 

Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser Ala Leu Lys Ser Arg Leu 
50 55 60 

Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln Val Phe Leu Lys Met Asn 
65 70 75 80 

Ser Leu His Thr Asp Asp Thr Ala Arg Tyr Tyr Cys Ala Arg Glu Arg 
85 90 95 

Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val Ser 
100 105 110 

Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 
115 120 125 

Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser Leu Ser Ala Ser Val Gly 
130 135 140 

Glu Thr Val Thr Ile Thr Cys Arg Ala Ser Gly Asn Ile His Asn Tyr 
145 150 155 160 

Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys Ser Pro Gln Leu Leu Val 
165 170 175 

Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val Pro Ser Arg Phe Ser Gly 
180 185 190 

Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys Ile Asn Ser Leu Gln Pro 
195 200 205 

Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His Phe Trp Ser Thr Pro Arg 
210 215 220 

Thr Phe Gly Gly Gly Thr Lys Leu Glu 
225 230 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AATTCGATAT CGAAGCGCGG CGGATCCCAG GTGCAGCTGC AGTAAGTGAC 50 
TAAGCTCGAG ATCAAACGGT GATAAGCTCG CTTA 84 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

AGCTTAAGCG AGCTTATCAC CGTTTGATCT CGAGCTTAGT CACTTACTGC 50 
AGCTGCACCT GGGATCCGCC GCGCTTCGAT ATCG 84 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Ile Ser Lys Arg Gly Gly Ser Gln Val Gln Leu Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Val Thr Lys Leu Glu Ile Lys Arg 
1 5 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
TGTCACGATC TCCTCTTAAG GGATAAGTGC CTTGGTAGTC 40 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
AGTCGAATTC GATATCACAT TAGCGGATCC GGAGATCGTG ACA 43 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Val Thr Ile Ser Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Gly Ser Ala Asn Val Ile Ser Asn Ser Thr 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GAACTAACGA ACCGTCCATC 20 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
AATTGCGGCC GC 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
AGGTSMARCT GCAGSAGTCW GG 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
TGAGGAGACG GTGACCGTGG TCCCTTGGCC CC 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GACATTGAGC TCACCCAGTC TCCA 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

GTTTGATCTC GAGCTTGGTC CC 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..702 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

CTG CAG GAG TCA GGG GGA GGC TTA GTG CAG CCT GGA GGG TCC CGG AAA 48 
Leu Gln Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Arg Lys 
1 5 10 15 

CTC TCC TGT GCA GCC TCT GGA TTC ACT TTC AGT AAC TTT GGA ATG CAC 96 
Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Asn Phe Gly Met His 
20 25 30 

TGG GTT CGT CAG GCT CCA GAG AAG GGG CTG GAG TGG GTC GCA TAC ATT 144 
Trp Val Arg Gln Ala Pro Glu Lys Gly Leu Glu Trp Val Ala Tyr Ile 
35 40 45 

AGT AGT GGC GGT ACT ACC ATC TAC TAT TCA GAC ACA ATG AAG GGC CGA 192 
Ser Ser Gly Gly Thr Thr Ile Tyr Tyr Ser Asp Thr Met Lys Gly Arg 
50 55 60 

TTC ACC ATC TCC AGA GAC AAT CCC AAG AAC ACC CTG TTC CTG CAA ATG 240 
Phe Thr Ile Ser Arg Asp Asn Pro Lys Asn Thr Leu Phe Leu Gln Met 
65 70 75 80 

ACC AGT CTA AGG TCT GAG GAC ACG GCC ATG TAT TTC TGT GCA AGA TCC 288 
Thr Ser Leu Arg Ser Glu Asp Thr Ala Met Tyr Phe Cys Ala Arg Ser 
85 90 95 

TGG GCC TAT GCT ATG GAC TAC TGG GGC CAA GGG ACC ACG GTC ACC GTC 336 
Trp Ala Tyr Ala Met Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val 
100 105 110 

TCC TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC TCT GGC GGT GGC GGA 384 
Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
115 120 125 

TCG GAC ATC GAG CTC ACC CAG TCT CCA TCT TAT CTT GCT GCA TCT CCT 432 
Ser Asp Ile Glu Leu Thr Gln Ser Pro Ser Tyr Leu Ala Ala Ser Pro 
130 135 140 

GGA GAA ATC ATT ACT ATT AAT TGC AGG GCA AGT AAG AGT ATT AGC AAA 480 
Gly Glu Ile Ile Thr Ile Asn Cys Arg Ala Ser Lys Ser Ile Ser Lys 
145 150 155 160 

TAT TTA GCC TGG TAT CAA GAG AAA CCT GGA AAA ACA AAT AAG CTT CTT 528 
Tyr Leu Ala Trp Tyr Gln Glu Lys Pro Gly Lys Thr Asn Lys Leu Leu 
165 170 175 

ATC TAC TCT GGA TCC ATT TTG CAA TCT GGA ATT CCA TCA AGG TTC AGT 576 
Ile Tyr Ser Gly Ser Ile Leu Gln Ser Gly Ile Pro Ser Arg Phe Ser 
180 185 190 

GGC AGT GGA TCT GGT ACA GAT TTC ACT CTC ACC ATC AGT AGC CTG GAG 624 
Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Glu 
195 200 205 

CCT GAA GAT TTT GCA ATG TAT TAC TGT CAA CAG CAT AAT GAA TAC CCG 672 
Pro Glu Asp Phe Ala Met Tyr Tyr Cys Gln Gln His Asn Glu Tyr Pro 
210 215 220 

TGG ACG TTC GGT GGA GGG ACC AAG CTC GAG 702 
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu 
225 230 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 234 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Leu Gln Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Arg Lys 
1 5 10 15 

Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Asn Phe Gly Met His 
20 25 30 

Trp Val Arg Gln Ala Pro Glu Lys Gly Leu Glu Trp Val Ala Tyr Ile 
35 40 45 

Ser Ser Gly Gly Thr Thr Ile Tyr Tyr Ser Asp Thr Met Lys Gly Arg 
50 55 60 

Phe Thr Ile Ser Arg Asp Asn Pro Lys Asn Thr Leu Phe Leu Gln Met 
65 70 75 80 

Thr Ser Leu Arg Ser Glu Asp Thr Ala Met Tyr Phe Cys Ala Arg Ser 
85 90 95 

Trp Ala Tyr Ala Met Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val 
100 105 110 

Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
115 120 125 

Ser Asp Ile Glu Leu Thr Gln Ser Pro Ser Tyr Leu Ala Ala Ser Pro 
130 135 140 

Gly Glu Ile Ile Thr Ile Asn Cys Arg Ala Ser Lys Ser Ile Ser Lys 
145 150 155 160 

Tyr Leu Ala Trp Tyr Gln Glu Lys Pro Gly Lys Thr Asn Lys Leu Leu 
165 170 175 

Ile Tyr Ser Gly Ser Ile Leu Gln Ser Gly Ile Pro Ser Arg Phe Ser 
180 185 190 

Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Glu 
195 200 205 

Pro Glu Asp Phe Ala Met Tyr Tyr Cys Gln Gln His Asn Glu Tyr Pro 
210 215 220 

Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu 
225 230 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CATGCAGTCT TCGGGC 16 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
TTAAGCCCGA AGACTG 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Asn Val He Ser Lys Arg 
1 5 
CLAIMS 

1. A process for producing fusion proteins comprising ScFv fragments by a 
transformed mould, in which 

(a) the mould belongs to the genus Aspergillus, and 

(b) the Aspergillus contains a DNA sequence encoding the ScFv fragment 
under control of at least one expression and/or secretion regulating region derived 
from a mould selected from the group consisting of promoter sequences, 
terminator sequences and signal sequence-encoding DNA sequences, and 
functional derivatives or analogues thereof, 

optionally followed by a proteolytic cleavage step for separating the ScFv fragment 
part from the fusion protein. 

2. A process according to claim 1, in which said "at least one expression 
and/or secretion regulating region derived from a mould" is the combination of 
both a promoter sequence and a signal sequence-encoding DNA sequence derived 
from a glucoamylase gene ex Aspergillus plus a terminator sequence of a trpC gene 
ex Aspergillus. 

3. A process according to claim 1, in which said "at least one expression
and/or secretion regulating region derived from a mould" is derived from the
endoxylanase II gene (exlA gene) of Aspergillus niger var. awamori present on
plasmid pAW14B.

4. A process according to claim 1, in which said DNA sequence encoding
the ScFv fragment forms part of a chimeric gene encoding a fusion protein, 
whereby said DNA sequence encoding the ScFv fragment is preceded at its 5' end 
by at least part of a structural gene encoding the mature part of a secreted mould 
protein. 

5. A process according to claim 4, in which said structural gene encodes an 
endoxylanase or a glucoamylase. 

6. A process according to claim 4, in which said ScFv fragment in the fusion
protein is bound to said secreted mould protein or part thereof by a proteolytic
cleavage site.

7. A process according to claim 6, in which said cleavage site is a KEX2-like
site.

8. A process according to any one of claims 1-7, in which the mould is
cultured under such conditions that the yield of ScFv fragment is at least 40 mg/l,
preferably at least 60 mg/l, more preferably at least 90 mg/l and still more
preferably at least 150 mg/l.

9. New product comprising an ScFv fragment or fusion product thereof 
obtainable by a process according to any one of claims 1-8. 

10. New product according to claim 9, in which the ScFv fragment is a
modified ScFv fragment comprising complementary determining regions (CDRs)
grafted on the framework regions of the variable fragments of another ScFv
fragment that is well expressed and secreted by a lower eukaryote.

11. New product according to claim 10, in which the lower eukaryote is a
mould of the genus Aspergillus.

12. Composition containing a product produced by a process as claimed in
any one of claims 1-8 or a new product as claimed in any one of claims 9-11.

13. Composition according to claim 12, which is a consumer product. 

14. Composition according to claim 12, in which the ScFv fragment
recognizes a compound present in the human eco-system, which compound can be
a microorganism, an enzyme or another protein.

15. Composition according to claim 14, in which the compound is present in
the oral cavity.

16. Composition according to claim 15, in which the compound is involved in
the formation of plaque, caries, gingivitis, periodontal diseases, or bad breath.

17. Composition according to claim 14, in which the compound is present on 
the human skin. 

18. Composition according to claim 17, in which the compound is involved in
the formation of malodour, inflammation, or hair loss. 

19. Composition according to claim 14, in which the compound is a hormone, 
which composition can be used for diagnostic purposes. 

20. Composition according to claim 19, in which the hormone is human 
chorionic gonadotropin (HCG). 

21. Composition according to claim 12, in which the ScFv fragment
recognizes a compound present in the eco-system of domestic and agricultural
animals, which compound can be a feed component, an enzyme or another protein,
or a disease causing agent.

22. Composition according to claim 12, in which the ScFv fragment
recognizes a compound that has a positive or negative relationship with a disease
or disorder and can be used for detection and/or targeting purposes.

23. Composition according to claim 12, which can be used in the chemical, 
petrol or pharmaceutical industry as catalyst or for detection purposes. 

24. A process for producing fusion proteins comprising ScFv fragments by a 
transformed mould, in which 

(a) the mould belongs to one of the genera Mucor, Neurospora, and
Penicillium, and

(b) the mould contains a DNA sequence encoding the ScFv fragment under
control of at least one expression and/or secretion regulating region derived from
a mould selected from the group consisting of promoter sequences, terminator
sequences and signal sequence-encoding DNA sequences, and functional
derivatives or analogues thereof,

optionally followed by a proteolytic cleavage step for separating the ScFv fragment
part from the fusion protein,
whereby optionally the mould is cultured under such conditions that the yield of
ScFv fragment is at least 40 mg/l, preferably at least 60 mg/l, more preferably at
least 90 mg/l and still more preferably at least 150 mg/l.

25. New product comprising an ScFv fragment or fusion product thereof
obtainable by a process according to claim 24.

26. Composition containing a product produced by a process as claimed in 
claim 24 or a new product as claimed in claim 25. 



WO 94/29457    PCT/EP94/01906

FIGURE 1 (sheet 1/7): plasmid map, coordinate scale running to 7439; labelled restriction sites: NotI, HindIII, EcoRV, NcoI.



FIGURE 2 (sheet 2/7): plasmid map; labelled restriction sites: NotI, XbaI, HindIII, PstI, BamHI, SfiI, NcoI.



FIGURE 3 (sheet 3/7): plasmid map, coordinate scale running to 9300; labelled restriction site: NcoI.



FIGURE 4 (sheet 4/7): plasmid map, coordinate scale running to 10044; labelled restriction sites: XbaI, HindIII, SalI, BamHI, EcoRV, NcoI.


